FIELD PLOT TECHNIQUE

by

WARREN H. LEONARD, M. Sc. (Nebraska)
Professor of Agronomy, Colorado State College

and

ANDREW G. CLARK, M. A. (Colorado)
Professor of Mathematics, Colorado State College

Copyright 1939 by Warren H. Leonard and Andrew G. Clark
1946 Printing

BURGESS PUBLISHING CO.
426 SOUTH SIXTH STREET
MINNEAPOLIS 15, MINN.

PREFACE

This manual has been the outgrowth of a set of lectures on Field Plot Technique given to seniors and graduate students at Colorado State College since 1930. It has been found practical in the classroom for a 2 to 4 credit combined lecture and laboratory course. The problems and questions have proved to be important aids to the student.

While "Field Plot Technique" has been prepared primarily for class use, it is hoped that it will appeal to the technical worker in Agronomy as a reference to the more important statistical methods and tables. The large number of references quoted will give the reader a ready reference to the major papers on various phases of applied statistics.

The organization of the subject matter, and the manner in which the statistical methods are interwoven with the applications, differ somewhat from the conventional approach. The writers feel that the student of agronomic experimentation needs an elementary picture of the factors to be considered in a research program with special reference to the field experiment. For this reason, an attempt has been made to coordinate the historical and logical background of agronomic experimentation with statistical techniques and their application to the design of the practical types of field experiments. This also requires that the student be familiar with the mechanical procedures generally followed in routine experimental work.
The development of the various statistical techniques has been intuitive rather than rigorously mathematical. The aim has been to lead the student to understand the formulas he applies without necessarily being able to derive them mathematically. The symbolism employed in the text was chosen with regard to what appears to be the most common usage. Considerable effort has been spent in striving for consistency. That it is impossible in an elementary text to present and interpret many of the complexities involved in some modern experiments is obvious. It is hoped that a sufficient foundation will be laid for the student so that he can intelligently study the more advanced treatises.

The writers are deeply indebted to Dr. F. R. Immer, Professor of Agronomy and Plant Genetics, University of Minnesota, for permission to make liberal use of his classroom material, especially in chapters 11, 17, and 18. They wish to express their appreciation to Dr. S. C. Salmon, Division of Cereal Crops and Diseases, U. S. Department of Agriculture, for criticisms and helpful suggestions. Dr. K. S. Quisenberry of the same division has assisted by his criticisms of chapter 21. The writers are particularly grateful to Professor R. A. Fisher and his publishers, Oliver and Boyd, for permission to reproduce the Table of χ² from "Statistical Methods for Research Workers." Professor G. W. Snedecor, Iowa State College, generously allowed us to include his table of "F and t". The writers also wish to express their thanks to Dr. C. I. Bliss for permission to use his table of angular transformations. The table of Naperian logarithms used in the manual is taken from "Four Figure Mathematical Tables" by the late J. T. Bottomley and published by Macmillan and Co., Ltd. (London). The writers are grateful to the publishers and to the representatives of the author for permission to use this table. To Dr. D. W.
Robertson, one of their colleagues, they express their appreciation for various helpful suggestions.

TABLE OF CONTENTS

Part I. Introduction to Experimentation

CHAPTER
I Status of Agronomic Research
II History of Basic Plant Sciences
III Logic in Experimentation
IV Errors in Experimental Work

Part II. Statistical Analysis of Data

V Frequency Distributions and their Application
VI Tests of Significance
VII The Binomial Distribution and its Applications
VIII The χ² Test of Goodness of Fit and Independence
IX Simple Linear Correlation
X The Analysis of Variance
XI Covariance with Special Reference to Regression

Part III. Field and Other Agronomic Experiments

XII Soil Heterogeneity and its Measurement
XIII Size, Shape, and Nature of Field Plots
XIV Competition and other Plant Errors
XV Design of Simple Field Experiments
XVI Quadrat and other Sampling Methods
XVII The Complex Experiment
XVIII The Split-Plot Experiment
XIX Confounding in Factorial Experiments
XX Symmetrical Incomplete Block Experiments
XXI Mechanical Procedure in Field Experiments

Part IV. Appendix

TABLE
1 Normal Probability Integral Table
2 Table of "F" and "t"
3 The χ² Table
4 Table of One-half Naperian Logarithms
5 Table of the Angular Transformation
6 Table of Random Numbers
Index

Part I
Introduction to Experimentation

CHAPTER I
STATUS OF AGRONOMIC RESEARCH

I. Rise of Agronomic Research

Although the art of agronomy has been practiced for centuries, the science of agronomy is only about 100 years old. The need for reliable information in this country has come about gradually as farmers have come to realize some of the many problems which confront the agricultural industry, problems in soil fertility, the control of diseases and pests, winter-hardiness in crops, among many others.
In addition to the needs of the farmers themselves, the establishment of the Land-Grant Colleges under the Morrill Act in 1862 brought about an acute need for subject matter for the agricultural colleges. It soon became very apparent that the problems in agriculture were complex and that well-trained men were needed to solve them. In general, it may be said that agricultural research began with simple empirical tests, but has gradually developed until it has now attained a scientific basis.

In the short space of 75 years, so much subject matter has been accumulated in the field of agriculture that no one man could hope to be familiar with all of it. This led to specialization within the field between 1900 and 1910 in America. The branches recognized in most agricultural colleges and experiment stations are: Agronomy, animal husbandry, horticulture, entomology, forestry, home economics, and veterinary medicine or pathology.

Agronomy as a science was developed from the old style variety trials, crop rotation tests, and soil culture experiments, when field culture was an empirical art. Research workers and others interested in the science of crops and soils formed the American Society of Agronomy in 1907. In regard to Agronomy, Carleton (1907) states: "As a science it investigates anything and everything concerned with the field crop, and this investigation is supposed to be made in a most thorough manner, just as would be done in any other science". Thus, agronomy is the laboratory and workshop of many sciences: Agrostology, chemistry, botany, ecology, genetics, pathology, physics, physiology, and others concerned with the problems of crops and soils.

Ball (1916) early observed that it has been necessary for the experimenter (in agronomy) to turn from the gross aspect to minute detail in order to solve some of its problems.
Empirical knowledge has been rapidly supplemented by fundamental information as a result of organized research and the improvement in its technique.

II. Establishment of Experiment Stations

It is difficult to realize that the present large network of experiment stations in this country and in other parts of the world has been established in the past 100 years. In fact, the science of agriculture practically began with this movement.

(a) First Experiment Station

Jean Baptiste Boussingault established the first experiment station in 1834, being the first man to undertake field experiments on a practical scale. He farmed land at Bechelbronne, Alsace, where he carried on research of a high calibre. Boussingault set out to investigate the source of nitrogen in plants, and systematically weighed the crops and the manures applied for them. He analyzed both and prepared a balance-sheet. Furthermore, this investigator studied the effects on plants when legumes were in the rotation. He concluded that plants obtained most of their nitrogen from the soil. (See Chapter 2.)

(b) Rothamsted Experimental Station

The Rothamsted Experimental Station was established by John Bennet Lawes on his farm in England in 1843. Hall (1905), in his account, states that "Rothamsted is now a household word wherever the science of agriculture is studied." Lawes found that phosphates were important fertilizers and discovered a method to make phosphate fertilizer by the application of sulfuric acid to phosphate rock. Formerly bones were used as a sole source of phosphates. This significant discovery led to experimentation on a large scale. The systematic field experiments, begun in 1845 and continued to this day, have dealt particularly with soil fertilizers and crop rotation. These experiments long have been models for carefully planned experiments. Lawes was aided by Dr. J. H. Gilbert, who commenced work at Rothamsted in 1843. The two men worked together for 57 years.
Recently, Dr. R. A. Fisher has brought about modifications in the field experiments to make them amenable to statistical treatment.

(c) American Experiment Stations

Some of the early history of American experiment stations is given by True (1937) and by Shepardson (1929). South Carolina went on record as favoring an experiment station in 1785, but the general movement for the establishment of experiment stations began about 1871 because of the attention attracted by the experiments of Lawes and Gilbert of England. In the meantime, the Morrill Act, signed by Lincoln in 1862, provided for the so-called land-grant colleges for the study of agriculture and mechanic arts. California established the first experiment station in 1875, and began field experiments on deep and shallow plowing for cereals. A station was started in North Carolina in 1877, after which many others followed. The Hatch Act, passed by Congress in 1887, was the start of the present experiment stations. Twelve were in existence at that time. Increased funds were provided by the Adams Act in 1906, by the Purnell Act in 1925, and by the Bankhead-Jones Act in 1935. The United States Department of Agriculture has been of rather recent origin, the Secretary becoming a Cabinet member in 1889. At present, the federal government controls funds given to the states for experimental work. In general, the system has been satisfactory because it has proved to be participation and coordination rather than control.

III. Reasons for Public Support of Agricultural Research

There has been some criticism on the use of public funds for agricultural research, but their use has been justified on the grounds that the welfare of agriculture is basic to the nation. In addition, it would be almost impossible to place agricultural research on a self-supporting basis because the results of research are so difficult to control through patents or otherwise.
(a) Agricultural Welfare

There are 6,000,000 farmers and 360,000,000 to 365,000,000 acres of cultivated land in this country, some of which has been cultivated more than 300 years. The virgin fertility, in many cases, has been exhausted. The experiences and needs of these farmers are significant because the prosperity of the nation depends to a large extent upon agriculture. The production of food and fiber is fundamental to the public welfare, as research that leads to lower cost of production passes its benefits on to the consumer. Haskell (1923) calls attention to the fact that, in the case of crop losses due to diseases and other factors, the consumer ultimately pays a higher price for his food. He pays for depleted soil fertility in the same way. Thus, the state may actually gain more from the benefits of research than the farmer himself.

(b) Limitations of Farm Experiences

It has been impossible for several reasons to collect scientific information of much value from farm experiences. (1) Inadequate Farm Records: The results obtained by farmers are inaccurately and incompletely recorded from the experimental viewpoint. Their experiences are generally limited to acres and yields such as found in stories in the farm press. Farmers very often place undue emphasis on the unusual. (2) Failure to Consider all Factors: The essence of scientific progress is to determine "why". Among the many variables in agriculture, variation in season is exceedingly important and may over-shadow all other factors. The farmer is quite likely to base his judgment and conclusions on the results of one or two years' performance. Thorne (1909) states that many experiments which farmers attempt are valueless or misleading because of failure to observe some essential condition of experimentation. (3) Inadequate Training: As a rule, the farmer lacks the training or experience necessary for the evaluation of experimental results.
Hall (1905) makes this statement: "Agricultural science involves some of the most complex and difficult problems the world is ever likely to have to solve, and if it is to continue to be of benefit to the farmer, investigations, so far as their actual conduct goes, must quickly pass into regions where only the professional scientific man can hope to follow them ...." (4) Inadequate Funds: Farmers lack the funds, help, and equipment necessary for experimental work. Experimentation is quite expensive since practical considerations are necessarily put aside. An experiment must be conducted with precision in order to obtain reliable results, rather than for financial return. For instance, Hall (1905) tells that some Rothamsted fields have grown wheat for 60 years, year after year, on the same land. As the modern farmer seldom grows wheat continuously, he looks upon this experiment as hopelessly impractical when it is pointed out to him on field days. Nevertheless, this very test furnished the bulk of the early proof that losses in yield would result from continuous wheat culture. The aim of the Rothamsted test, as it continues, is to find out how the wheat plant grows.

IV. Experiment Station Funds

Agricultural research in this country is publicly financed almost altogether. Federal and state agencies spent about 25 million dollars on agricultural research for the year 1927-28. This total sum represented approximately 0.20 percent of the gross income for agricultural products, a figure wholly within reason.

(a) The Hatch Act

The first Federal subsidy for agricultural research was the Hatch Act, passed in 1887. It gave each state $15,000 per year, a wide latitude in the use of the funds being permitted.
The Act made it possible to conduct original experiments or verify experiments along lines as follows: (1) physiology of plants and animals; (2) diseases of plants and animals with remedies for the same; (3) the chemical composition of useful plants at different stages of growth; (4) rotation studies; (5) testing the adaptation of new crops and trees; (6) analyses of soils and water; (7) chemical composition of manures, natural and artificial, and their effect on crops; (8) test the adaptation and value of grasses and forage plants; (9) test the composition and digestibility of different foods for domestic animals; (10) research on butter and cheese production; and (11) examination and classification of soils. None of the funds can be used for the purchase or rental of lands or expenses for farm operations.

(b) The Adams Act

A similar amount of money was granted to the states by the Adams Act, passed in 1906. The funds must be used for original researches or experiments that bear directly on agriculture. Research of a fundamental nature is required under this fund. None of the money can be applied to substations, or to the purchase or rental of land.

(c) The Purnell Act

The Purnell Act passed in 1925 provided for additional funds which now amount to $60,000 per year for each state. These funds must be used on specific projects, but the requirements are less exact than for the use of Adams funds. The Act provides for investigations on the production, manufacture, preparation, use, distribution, and marketing of agricultural products.

(d) The Bankhead-Jones Act

Certain difficulties in the use of experimental funds for broad general projects led to the passage of the Bankhead-Jones Act in 1935 which will, in five years (1940), provide $5,000,000 for research.
Its provisions have been described as follows: "To conduct scientific, technical, economic, and other research into laws and principles underlying basic problems of agriculture in its broadest aspects....". It also authorizes research for the improvement of quality of agricultural commodities and for the discovery of uses for farm products and by-products. The U. S. Department of Agriculture receives 40 per cent of this fund, while 60 per cent is allotted to the states on the basis of rural population. It is generally understood that the funds must be used for new lines of work.

V. The Personal Equation in Research

As for agricultural research in general, successful agronomic research depends upon the ability, permanency, and honesty of the workers. The personnel for investigational work must be well-trained in the basic sciences as well as leaders in agricultural thought. Their outlook must be broad.

(a) Education for Investigational Work

The amount of training necessary for research is great. The investigator must be skilled in the art of agronomy and trained in the closely related sciences. In fact, he should have an adequate educational background before research is even attempted. A good foundation in English, physics, and chemistry is basic for all research in agriculture. Biology adds the conception of organism, while mathematics is the common instrument. Thorough training in all branches of botanical science is desirable in agronomy. This includes taxonomy, anatomy, physiology, pathology, etc. Other sciences that are useful are: Geology, bacteriology, genetics, and statistics. Among the authorities who agree on this general type of background are Howard (1924), Wheeler (1911), Ball (1916), Carleton (1907), and Richey (1937). A practical viewpoint is necessary, but this is largely the result of boyhood training and common sense.
(b) Qualities in Successful Research Men

There is some question about the successful scientist necessarily being a genius. The term should be qualified to include perseverance, common sense, and infinite pains. Howard (1924) emphasized the qualities needed when he said: "Here the man is everything; the system is nothing." (1) Imagination: Some imagination is essential in the research worker but, of course, it must be scientific imagination. (2) Discrimination: An investigator must have the power of discrimination, that is, he must be able to recognize the essentials and non-essentials in research. He must select the features which are most worthwhile. It is possible to record too much data on a subject, and thus cloud the entire issue. (3) Accuracy: There is a great need for accuracy in experimental work. An investigator should record only those notes whose reliability is well established. One should never take measurements so fine that they imply false accuracy. The figures taken by an investigator should give him confidence in his work. (4) Honesty in observation: The investigator should always accept observations without regard to their agreement with his own preconceived ideas. One should record only the things he sees. (5) Fairness: The research man should give due credit to others, and keep within his own field unless a phase of his work calls for cooperation with others. (6) Enthusiasm: One should be enthusiastic about his work, being ready to put in long hours or extra time when necessary. Call (1922) says there must be a love for the work so great, in those engaged in research, that it will enable them to push forward in the face of obstacles which may seem insurmountable. (7) Courage: One should always have the courage of his convictions. He should not be afraid to try something new.
(c) Initiative in Experimental Projects

The project system has an enormous value in the coordination, continuity, and conclusion of agricultural experimental work because it requires the submission of an outline and its approval before any work is done. Success in experimental projects depends upon the leader, his scientific attitude, depth of motive, conception of the problem, and its requirements. Not all research is good. In fact, there is a chance for much waste. While partial failure is inevitable, it is possible for the investigator to gauge plausible success. Allen (1930) advises research workers to think scientifically, avoid adherence to routine, and keep abreast of the times. The investigator should avoid the belief that his own compartment is water tight and self-sufficient.

VI. Results of Agronomic Research

Many contributions have been made in crops and soils by the experiment stations, particularly during the past 25 years. Some of the more important advances in the past quarter of a century in field crops are summarized by Warburton (1933) while those in soils are given by Lipman (1933).

(a) Field Crops

Among the contributions in corn have been the discovery that the show-type ear is unrelated to its performance in the field, that ear-to-row breeding may not lead to improvement in corn yields, and that the combination of inbred lines in hybrids has resulted in higher corn yields. In wheat, the discovery of rust resistance and of physiologic races has enabled investigators to breed for resistant varieties. The same is true for bunt. The introduction and use of sorghums, as well as their improvement, has resulted in their production throughout the west. The cause of flax "sickness" has been discovered as due to wilt with the result that resistant varieties have been bred. Sweet clover, once a weed, has been found to be a valuable crop.
Many improved varieties of crops have been developed for disease resistance, drouth resistance, high quality or high yield. Marquis wheat is one of the most widely known improved varieties. Tillage has been shown to be beneficial because of weed control rather than moisture conservation from a dust mulch. Both Funchess (1929) and Richey (1937) have given similar lists of advances made in field crop science.

(b) Soils

A quarter-century ago, physical-chemical analyses of soil without other data were frequently erroneous as a basis for the estimation of the agricultural value of soils. In recent years some of the more valuable contributions have been as follows: (1) use of mineral fertilizers to improve soil fertility; (2) ionic exchange in soil colloids that led to an explanation of alkali-soil formation; (3) soil classification and soil survey; (4) soil acidity in its relation to plant growth; (5) soil colloids and their properties; (6) soil bacteria and other organisms and their influence on soil fertility; and (7) soil erosion and its control. That soil productivity may be maintained for a long period of time by the use of sound rotation and manurial practices has been shown by the Morrow plots at Illinois. The results for 39 years have been summarized by DeTurk et al. (1927).

VII. Value of Early Agronomic Experiments

Some of the investigational work in agronomy before 1910 was of little value due to errors in the experiments, many of which were great enough to vitiate the conclusions. Contradictions were common. As Piper and Stevenson (1910) point out, results were sometimes suppressed because they failed to coincide with current theory. "In short, all scientific evils necessarily associated with experimental methods are too evident in the field work in agronomy." The same type of criticism applies to other agricultural branches at that time. There were many reasons for this situation.
The "guess method" was widely used by the old school of experimenters for the accumulation of information. They usually lacked facts, lacked a broad outlook, were limited in their experiences and, in many cases, had wide differences in viewpoints. Some of the shortcomings have been due to pressure for information with the result that the conclusions were often based on too few data. Other weaknesses were due to the viewpoint in some quarters that empirical facts were preferable to fundamental information from a practical standpoint.

While many of the early experiments would be unacceptable today in the light of modern experimental standards, they nevertheless contributed to progress. Some of them were as well conducted as those of today. Early agricultural practices were determined quite as much by opinion as by experiment. It would have been a poor experiment indeed were it to be less reliable than unsupported opinion. These early experiments must be evaluated in relation to the knowledge of the time as well as their effects on agricultural science and practice. For example, the early Rothamsted investigations on the source of nitrogen in plants finally led to a solution of the problem even though the field experiments conducted in connection with them would be considered today as inadequate.

Many of the weaknesses in early experiments have been met gradually through (1) wider application of modern statistical methods, (2) replication of plots or treatments, and (3) wider use of the inductive or scientific method in which general principles are sought rather than empirical facts.

VIII. Present Trends in Agronomic Research

Some very definite trends are apparent in modern agronomic research, among them being the emphasis on design of experiments, long-time projects, and regional coordination of research.

(a) Design of Experiments

In recent years, a great deal of stress has been placed on the design of experiments.
The field lay-out and the method of analysis of the data are coordinated so as to lead to more efficient experimental results. The emphasis on design has been made by the Rothamsted workers. Design focuses attention on the objects of an experiment that can be attained in no other way. This trend promises to reduce the number of situations where an experiment is conducted and data collected before a method of analysis is conceived.

(b) Long-time Projects

Another definite trend is toward the long-time project. According to Henry Wallace (1936), "The solution of problems related to crop production is a matter of years. The improvement of plants by breeding must extend through many generations. Varieties must be compared in a number of different kinds of seasons for correct evaluation. The same is true of tests of fertilizers, spraying practices, and cultural methods. To be productive, a program of plant research accordingly must be stable, with a concentration of effort until a given problem is solved or its solution found impractical for the time being."

(c) Regional Coordination of Research

The regional coordination of research work to reduce duplication of effort is being regarded more and more as essential. It has been stressed by Call (1934), Jarvis (1931), and others. Agronomic research began as isolated bits of investigation to solve local problems. Cooperation and coordination were developed later to reduce wasteful duplication. They also make possible a comprehensive attack on intricate problems, as well as the elimination of artificial boundaries. Such effort encourages personal contacts and exchanges of ideas between different investigators. Various bureaus of the U. S. Department of Agriculture took the leadership in regional coordination. The most formal efforts on regional coordination are in the northeastern states on pasture investigations and soil organic matter studies.
The limitation of initiative and individuality of investigators has sometimes been feared as a result of regional coordination, but for the most part appears to be unfounded.

References

1. Allen, E. W. Initiating and Executing Agronomic Research. Jour. Am. Soc. Agron., 22:341. 1930.
2. Ball, C. R. Some Problems in Agronomy. Jour. Am. Soc. Agron., 8:337-347. 1916.
3. Call, L. E. Increasing the Efficiency of Agronomy. Jour. Am. Soc. Agron., 14:329-339. 1922.
4. Call, L. E. Regional Coordination of Agronomic Research from the Standpoint of the Station Director. Jour. Am. Soc. Agron., 26:81-88. 1934.
5. Carleton, M. A. Development and Proper Status of Agronomy. Proc. Am. Soc. Agron., 1:17-24. 1907.
6. DeTurk, E. E., Bauer, F. C., and Smith, L. H. Lessons from the Morrow Plots. Ill. Agr. Exp. Sta. Bul. 300. 1927.
7. Funchess, M. J. Some Outstanding Results of Agronomic Research. Jour. Am. Soc. Agron., 21:1117. 1929.
8. Hall, A. D. An Account of the Rothamsted Experiments (Preface and Introduction). 1905.
9. Hall, A. D. Fertilizers and Manures, pp. 1-38. 1928.
10. Haskell, S. B. Agricultural Research in its Service to American Industry. Jour. Am. Soc. Agron., l^tk^-kQl. 1923.
11. Howard, A. Crop Production in India, pp. 185-195. 1924.
12. Jarvis, T. D. The Fundamentals of an Agricultural Research Program. Sci. Agr., 26:81-38. 193^.
13. Lipman, J. G. A Quarter Century of Progress in Soil Science. Jour. Am. Soc. Agron., 25:9-25. 1933.
14. Office of Experiment Stations. Legislation and Rulings Affecting Experiment Stations. Miscel. Publ. 202. 193**.
15. Pearson, Karl. The Grammar of Science, p. 30. 1911.
16. Piper, C. V., and Stevenson, W. H. Standardization of Field Experimental Methods in Agronomy. Proc. Am. Soc. Agron., 2:70-76. 1910.
17. Richey, F. D. Why Plant Research. Jour. Am. Soc. Agron., 29:969-977. 1937.
18. Thorne, C. E. Essentials of Successful Field Experimentation. Ohio Agr. Exp. Sta. Cir. 96. 1909.
19. True, A. C.
A History of Agricultural Experimentation and Research in the United States. Miscel. Publ. 251. 1937.
20. Warburton, C. W. A Quarter Century of Progress in the Development of Plant Science. Jour. Am. Soc. Agron., 25:25-36. 1933.
21. Wheeler, H. J. The Status and Future of the American Agronomist. Proc. Am. Soc. Agron., 3:31-39. 1911.

Questions for Discussion

1. What conditions led to the subdivision of the agricultural field?
2. What are the functions of agronomy as a science? Why?
3. Who founded the first experiment station? What results were obtained?
4. When was the Rothamsted Experimental Station established? Where? By whom? Why?
5. What has Rothamsted contributed to early agricultural science?
6. Where was the first American experiment station established? When? Upon what did it work?
7. Give some reasons to justify agricultural experimentation as a public duty.
8. Why is agriculture in America a national concern second to none?
9. Give several reasons why a farmer is generally unable to do experimental work.
10. Name the acts of Congress that contributed to the agricultural experiment stations, together with their dates of passage.
11. What special requirements are necessary for the expenditure of funds under the Adams Act? Bankhead-Jones Act? Hatch Act?
12. What kind of basic training is necessary for agronomic research?
13. What would you consider as some of the most important attributes of a successful investigator?
14. Discuss briefly the following characteristics in relation to research: (1) imagination, (2) classification, (3) discrimination, (4) accuracy, and (5) thoroughness.
15. Why has the project system been useful in research?
16. Name five contributions to crop knowledge made by experiment stations. Five contributions to soil science.
17. What are some reasons for the early contradictions in agronomic science?
18. What was the value of early agronomic experiments? What were some of their weaknesses?
19- Name and discuss three trends in agronomic research at the present time. CHAPTER II HISTORY OF BASIC PLANT SCIENCES : . / . " I. Early History of Basic Sc i ences Agronomy as a science "began with the establishment of the first experiment station by Jean Baptiste Boussingault in 183k, although many empirical facts were known be- fore that time. '.• (a) Early Science Science, in general, dates from Aristotle who was the founder of zoologj and the forerunner of evolution. . A a one of the founders of the inductive method he first conceived the idea of organized research. In fact, his principles might well be observed at the present time. After Aristotle, little progress was made for 2,000 years. Among his theories was the one that the universe was composed of four ele- ments: air, earth, fire, and water. This was accepted for centuries because the habit was to assume some man as an authority rather than to investigate. At the beginning of the 17th century, Newton and Galilee began to base conclusions on facts. Francis Bacon wrote books which emphasized that theories should be based on facts rather than on authorities. (b) Reasons for Slow Progr ess in Scie nce Progress in agricultural science has had to wait on discoveries in the basic sciences of physics and chemistry. There are many reasons for the slow development of science in past ages, (l) Slavery was the general rule, with the result that there was little stimulus to improve. (2) Experimenters lacked accurate instrument's for measurement. (3) The mildness of the climate in the early civilized ceuntries restricted industry, (k) Mathematical science was restricted. (5) The scientific method developed by Aristotle was seldom used. Instead, it was the habit to assume a general law. (6) Superstition and interference by the clergy discouraged experi- mentation. II. Development of Agricult ural Science There was little activity in the sciences related to agriculture before _l800. 
Fundamental discoveries at the close of the 18th century, together with the appearance of several treatises on agriculture, started rapid development. Sir Humphrey Davy (1813) published a book entitled "Essentials of Agricultural Chemistry" in which he brought together many known facts. A. von Thaer (1810) published a book on "Reasons for Agriculture" in which he emphasized the value of humus in the soil, from which he believed plants gained their carbon. In 1840, Justus von Liebig published his book on organic chemistry in relation to agriculture, in which he advocated that the soil need only be supplied with minerals. This latter work struck the scientific world as a thunderbolt. It has had a great deal of influence on modern agricultural research. The establishment of the Rothamsted Experimental Station in 1838 also reflected the interest in agricultural science.

The discoveries important to agriculture since 1850 have been: (1) the theory of evolution; (2) the discovery of anaerobic bacteria; (3) the source of nitrogen in plants through the aid of bacteria; (4) Mendel's laws of heredity; (5) the chromosome theory of heredity as a physical basis for inheritance; and (6) the discovery of vitamins.

A -- Plant Nutrition

III. Early Plant Discoveries

Very little information was gathered on plant science from the time of the Greeks up to the Renaissance.

(1) Theophrastus: Published a book on plants entitled "Enquiry into Plants." He classified plants into herbs, shrubs, and trees. Theophrastus also distinguished bulbs, tubers, and rhizomes from true roots. Plant adaptation was discussed.

(2) Al Farbi: Discovered respiration in plants about 950 A.D.

(3) Johann van Helmont: This worker, who lived in the 17th century, believed that water was transformed into plant material. He placed 200 lbs. of soil in a receptacle and grew a willow in it. Nothing was added but water. At the end of five years, he found the willow weighed 169 lbs.
and 3 oz., while the original soil lost only 2 oz. from its original weight. He concluded that the growth came from the water alone, but failed to consider the air.

(4) Jethro Tull: Believed that earth was the true food of plants and that they absorbed soil particles. Therefore, he believed it necessary to finely pulverize the soil through cultivation. Tull developed cultural implements and devised a system to plant crops in rows.

IV. Source of Nitrogen in Plants

The period from 1840 to 1885 was taken up largely with the Rothamsted-Liebig controversy on the source of nitrogen in plants.

(a) Earlier Work on Nitrogen

The element nitrogen was discovered in 1772. Joseph Priestly, followed by Jans Ingen-Hausz, settled the fundamental fact that green plants in sunlight decompose the carbon dioxide from the atmosphere, set oxygen free, and retain the carbon. This source of carbon accounts for the bulk of dry matter in plants. From his work in 1804, Theodore De Saussure concluded that plants were unable to assimilate free atmospheric nitrogen, but obtained it from the nitrogen compounds in the soil. The pot experiments carried out by J. B. Boussingault, who began his investigations in 1834, indicated that plants draw their nitrogen entirely from the soil or manure.

(b) Liebig-Rothamsted Controversy

Justus von Liebig in 1840 maintained that green plants, by the aid of sunlight, derive their total substance from carbonic acid, water, and ammonia present in the atmosphere, and from simple inorganic salts in the soil which are afterwards found in the ash when the plant is burned. Liebig believed combined nitrogen in the soil to be unnecessary in plant nutrition. This view was disputed by Lawes and Gilbert, who began elaborate experiments at Rothamsted in 1857. They grew plants under glass shades, ammonia from the air being kept out. The earth, pots, manures, etc., employed in the experiment were burned to sterilize them.
Carbon dioxide was introduced as required. Lawes and Gilbert made their trials both without manure and with ammonium sulfate. Their work was done so carefully that the possibility of nitrogen fixation by plants was excluded. While they concluded that plants require combined nitrogen from the soil, they were unable to account for the gain in nitrogen in some plants under field conditions. They found actual gains in nitrogen when leguminous plants were grown in the field, which was in agreement with the long experiences of practical farmers.

(c) Final Experiments on Nitrogen Relations

The final experiment on nitrogen assimilation by plants was performed by H. Helriegel and H. Wilfarth, who found the symbiotic relationship between bacteria and legumes. When he grew plants in sand, Helriegel (1886) found that the Gramineae, Cruciferae, Chenopodiaceae, etc., grew almost proportionally to the combined nitrogen supplied. When it was absent, nitrogen starvation took place as soon as the nitrogen from the seed was exhausted. In legumes, he found that the plants were able to recover and begin luxurious growth. The roots always had nodules on them in such instances. However, legumes grown in sterile sand behaved the same as other plants, but recovery could be brought about when a watery soil extract was added to them. Renewed growth and assimilation of nitrogen was found to depend upon the production of nodules on the roots. Wilfarth (1887) found bacteria in the nodules and settled the point that bacteria are associated with nitrogen fixation. Later, these results were confirmed at Rothamsted and final proof was obtained on the role of nitrogen in plants. As Hall (1905) recounts, the "very vigor" of the Rothamsted laboratory prevented fixation of nitrogen by the exclusion of all possibility of inoculation. The legumes as a class were found to be an exception to the contention that plants could use only combined nitrogen from the soil. Both schools were partly right.
B -- Evolution and Genetics

V. Early Work in Genetics

Although many facts of inheritance were known previously, genetics has been regarded as a science only since 1900. At that time, the work of Gregor Mendel, originally published in 1865, was brought to light. Early work is reviewed by Roberts (1919; 1936), Zirkle (1932, 1935), and by Cook (1937).

(a) Sex in Plants

The bisexual nature of the date palm was recognized by the early Babylonians and Assyrians 5000 years ago. The ancients ascribed many monstrosities to hybridization. Many theories of heredity were in vogue, but no experimental data. Theophrastus and Pliny discussed sex in plants. Primitive men made improvements in crops, rice and maize being good examples.

About 1600 a new spirit of scientific skepticism began to be manifest. Many of the cumulative absurdities and theories were being put to experiment. The increased interest in biology culminated in the publication of the famous letter by Camerarius in 1694 on sex in plants. He gave convincing evidence that plants are sexual organisms. Sex in plants was demonstrated by actual experiments with spinach, hemp, and maize.

(b) Hybridization of Plants

This work was followed by the production of the first artificial plant hybrid by Thomas Fairchild in England, a short time before 1717. In the next 50 years there occurred a veritable wave of hybridizing. Crosses between more than a dozen genera were made by several investigators. This period culminated in the publication of the work of J. G. Koelreuter (1761-66) in which he reported the results of 136 experiments on artificial hybridization. In 1793, C. K. Sprengel observed cross pollination of plants by insects. However, Zirkle (1932) calls attention to the fact that insect pollination was observed by an American named Miller at a much earlier date.

From 1760 to 1859 there followed many experiments on plant hybridization in attempts to determine the nature of inheritance.
In 1822, John Goss (England) reported, but failed to interpret, dominance and recessiveness, and segregation in peas. A. Sageret (France) in 1826 classified contrasting characters in pairs, using muskmelons and cantaloupes. K. F. von Gaertner reported in 1835 on hybridizations made with 107 plant species. He noted plant vigor and the uniformity of the first generation after a cross. In 1863, C. V. Naudin (France) published a memoir on hybridization in which he almost discovered the laws of inheritance.

VI. The Theory of Evolution

The theory of organic evolution is one of the most profound theories propounded in the past 300 or 400 years. It was brought to fruition in the publication of the "Origin of Species" by Charles Darwin in 1859. Hybrids are discussed extensively in that work, but its contribution to genetics was mostly indirect. It marked the beginning of the modern experimental approach to biological problems.

(a) Evolution before Darwin

When Darwin published the "Origin of Species," spontaneous generation and special creation were the current theories. A great majority of naturalists believed that species were immutable productions specially created. Up to this time, empirical rather than scientific improvement had been made in plants. Darwin did not originate the evolution theory; he merely furnished evidence for its substantiation. Aristotle had expressed the central idea of evolution. Modern philosophy from Francis Bacon onward shows definiteness in its grasp and conception. Erasmus Darwin, grandfather of Charles Darwin, had a theory similar to that propounded in the "Origin of Species." J.B.P. de Lamarck, in his "Philosophie Zoologique" published in 1809, made the first attempt to produce a comprehensive theory of evolution. He added the idea of "use and disuse." Lamarck believed in the inheritance of acquired characters and attributed some influences to direct physical factors.
In other words, all the principal factors of evolution had been worked out before the time of Darwin with the possible exception of "survival of the fittest," which he obtained from a book by Malthus on population.

(b) The Work of Charles Darwin

Darwin made an extended trip around the world in the Beagle, collecting voluminous facts and making extensive observations in support of his theory. He is given credit for the evolution theory because he was the first to gather facts. He attempted to show how and why new species arose.

(1) Theory of Natural Selection: Present organic forms are believed to have evolved from more simple forms in past ages. The theory was founded on these facts: (a) variations between individuals are universally present; (b) a struggle for existence takes place between individuals; (c) through natural selection those individuals with the most favorable variations survive; and (d) heredity tends to perpetuate the favorable variations from natural selection.

(2) Reasons for Success: Darwin was successful because of his thoroughness, accuracy, hard work, honesty, ability to see, and because he was a stickler for details. He showed by example that disinterestedness, modesty, and absolute fairness are important attributes of character in intellectual work. Darwin (1859) himself states that his success was due to a love of science, unbounded patience for long reflection on a subject, industry in the collection and observation of facts, as well as a fair share of invention and common sense.

VII. The Cell in Relation to Inheritance

Independent progress was being made in other fields that were to have a profound influence on genetics after 1900. A. von Leeuwenhoek (Holland) discovered the microscope and saw mammalian germ cells in 1677. The cell theory was propounded in 1838-39 by M. J. Schleiden and T. Schwann (Germany).
This was the first generalized statement that all organisms are made up of cells, one of the greatest generalizations of experimental biology. The union of sperm and egg cells, i.e., fertilization, was first seen in seaweed by G. Thuret (France) in 1849. A year later he showed that the egg would not develop without fertilization. The chromosomes were described in 1875 by E. Strassburger (Germany). During the same year Oscar Hertwig (Germany) proved that fertilization consists of the union of two parental nuclei contained in the sperm and ovum. W. Flemming (Germany) in 1879-82 described the longitudinal splitting of the chromosomes, and later observed (1884-85) that the halves of split chromosomes went to opposite poles. Th. Boveri (Germany) in 1887-88 verified the earlier prediction of A. Weismann that reduction in the chromosomes takes place. In 1898, S. G. Navashin (Russia) discovered double fertilization in higher plants. Thus, the physical mechanism of inheritance was pretty well worked out by the time that the work of Mendel was discovered.

VIII. The Laws of Inheritance

The turn of the century proved to be an epochal year in the experimental study of heredity. The work of Gregor Mendel (Austria), an Augustinian monk, on inheritance in peas was rediscovered in 1900 by Hugo De Vries, C.F.J.E. Correns, and E. von Tschermak. The work had been published originally in 1866.

(a) Discovery of Principles of Heredity

Mendel made crosses of peas and observed carefully the resemblances and differences among different races. He began his work in 1857. The principles of heredity which he put forth were as follows: (1) single heredity units, (2) allelomorphism or contrasted pairs, (3) dominance and recessiveness, (4) segregation, and (5) combination. The last two are generally recognized as the distinct contributions of Mendel.

(b) Methods used by Mendel

There are several reasons for the success of Mendel.
His work differed from that of his predecessors in several respects. (1) He made actual counts and kept records of each generation. (2) One pair of factors was studied at a time. (3) His material was carefully studied and selected. (4) He guarded against errors in accidental crosses. (5) He worked with large numbers. (6) The crosses were studied for seven generations.

Roberts (1929) comments as follows on the work of Mendel: "Nothing in any wise approaching this masterpiece of investigation had ever appeared in the field of hybridization. For far-reaching and searching analysis, for clear thinking-out of the fundamental principles involved, and for deliberate, painstaking, and accurate following-up of elaborate details, no single piece of investigation in their field before his time will at all compare with it, especially when we consider the absolute absence of precedent and initiative for the work."

IX. Modern Developments in Genetics

The universality of Mendelian principles was verified in plants, animals, and man within three years. In 1902, Hugo De Vries advanced the mutation theory to explain sudden changes in plants that breed true, but which could not be accounted for by Mendelian inheritance. He found sudden changes in the evening primrose to breed true in certain cases. These mutations were believed to furnish the basis for evolution. This was soon followed by the pure-line concept, i.e., variations in the progeny of a single plant of a self-fertilized species are not due to inheritance. This was first put forth by W. L. Johannsen (Denmark) in 1905. H. Nilsson-Ehle (Sweden) advanced the multiple-factor hypothesis in 1908. The chromosome theory of heredity was announced by T. H. Morgan in 1910. His gene theory included the principle of linkage of genes resident on the same chromosome. This brilliant hypothesis has been upheld in many experiments.
Much recent work has been concerned with polyploidy, the mechanism of crossing-over, and sterility.

The principles of genetics have enabled plant breeders to make definite contributions to improved varieties. Many new varieties are now grown on farms that have been made possible through application of the laws of inheritance. Many varieties are "made to order" to meet particular conditions.

C -- Other Basic Sciences

X. Development of Bacteriology

Great advances were made in the field of bacteriology between 1860 and 1880, it being definitely established that bacteria bring about putrefaction, decomposition, and other changes. The work of Louis Pasteur dominated the field during this period.

(a) Pasteur and his Work

Pasteur discovered anaerobic bacteria. Fermentation was commonly thought to be the result of a chemical change, but Pasteur proved it to be due to anaerobic bacteria. This wrecked the theory of spontaneous generation of life. Pasteur showed that the presence of bacteria could always be traced to the entrance of germs from the outside, or to growth already present. Other contributions of Pasteur included the discovery of the causes of many bacterial diseases, and the development of methods of immunization. Pasteur had several attributes that led to his success: (1) he established truth by experiment; (2) he was discerning with regard to the problem on which he worked; and (3) he worked on one problem at a time.

(b) Other Discoveries in Bacteriology

Many further developments in bacteriology depended upon the improvement of the microscope and the perfection of various technics. The oil immersion lens was developed about 1860. The agar-plate method for the study of growing colonies of bacteria was introduced by Robert Koch in 1881. The transformation of ammonia to nitrates was demonstrated by T. Schloesing and A. Muntz in 1877, but it remained for S. Winogradsky to isolate the organisms concerned.
That nodules are formed on legumes as the result of inoculation with microorganisms was demonstrated by H. Helriegel and H. Wilfarth in 1886. M. W. Beijernick isolated non-symbiotic bacteria, i.e., the Azotobacter, in 1901. Among other contributions of bacteriology were sterilization technics, the classification of bacteria on a physiological basis (started by Ferdinand Cohn in 1872), the study of diseases due to filterable viruses, and studies in the nature of bacteriophagy.

XI. Plant Pathology

Like all natural sciences, plant pathology had its start with the dawn of civilization. The Hebrews mentioned plant diseases in the Bible, but only gave descriptions and mentioned damage. Little was known about plant diseases until the modern era, which began about 1850.

One of the greatest early workers was Anton de Bary (Germany), who proved the parasitism of fungi in 1853. A little later (1864) he proved heteroecism in rusts as illustrated by the relation of the aecidium on the barberry to the red and black rust stages on wheat. That bacteria may cause plant diseases was first proved by Thomas Burrill in 1879-81. He showed that a definite species, Bacillus amylovorus, was the causal agent of fire blight.

The use of Bordeaux mixture as a fungicide was started in France in 1886. Since that time, many other fungicides have been used in plant disease control, the latest being the organic mercury compounds.

Biologic strains in rusts were discovered in 1894 by J. Eriksson (Sweden), while races within a variety of rust were demonstrated by E. C. Stakman and his coworkers in 1916. Another important discovery was made by J. H. Craigie in 1927 when he discovered sexuality in the rusts.

A rapid increase in the knowledge of so-called virus diseases of plants has taken place since the first proof of tobacco mosaic as an infectious disease in 1888. The role of insects in the transmission of the virus or active principle was soon recognized. Recently, W. M.
Stanley (1937) has advanced strong evidence that the tobacco mosaic virus is due to a high molecular weight crystalline protein.

A great deal of attention is now being given to the production of disease-resistant and immune varieties of crop plants through the application of genetic methods.

References

1. Cook, R. A Chronology of Genetics. Yearbook of Agriculture, U.S.D.A., pp. 1457-1477. 1937.
2. Dampier-Whetham, W.C.D. A History of Science, pp. 33-40, 141-147, IJQ-169, and 285-287. 1931.
3. Darwin, Charles. Origin of Species. 1859.
4. Hall, A. D. The Book of the Rothamsted Experiments, preface and introduction, pp. 1-14. 1905.
5. Hall, A. D. Fertilizers and Manures, pp. 1-38. 1928.
6. Hayes, H. K., and Garber, R. J. Breeding Crop Plants, pp. 1-14. 1927.
7. Heald, F. D. Manual of Plant Diseases, pp. 17-20. 1933.
8. Roberts, H. F. The Founders of the Art of Plant Breeding. Jour. Her., 10:99-106, 147-152, 229-239, and 267-275. 1919.
9. Roberts, H. F. Plant Hybridization Before Mendel. 1929.
10. Russell, E. J. Soil Conditions and Plant Growth, pp. 1-31. 1932.
11. Stanley, W. M. Crystalline Tobacco Mosaic Virus Protein. Am. Jour. Bot., 24:59-68. 1937.
12. Theophrastus. Enquiry into Plants, Vol. I and II (Translated by Arthur Hort). 1926.
13. Vallery-Radot, R. The Life of Pasteur. 1928.
14. Weir, W. W. Soil Science, pp. 3-26. 1936.
15. Whetzel, H. H. An Outline of the History of Phytopathology.
16. Zirkle, C. Some Forgotten Records of Hybridization and Sex in Plants. Jour. Her., 23:433-448. 1932.
17. Zirkle, C. The Beginnings of Plant Hybridization. 1935.

Questions for Discussion

1. Who is considered the "Father of Science"? Why?
2. What were the contributions of Galileo, Bacon and Newton to science?
3. Why did science develop slowly previous to the 16th century?
4. Name several discoveries important to agriculture since 1850.
5. Who was Theophrastus, and what did he do?
6.
What were the views of these men on plant nutrition: Von Helmont, Jethro Tull, Thaer, Liebig?
7. What did De Saussure contribute to agricultural research? Sir Humphrey Davy? Boussingault?
8. What facts made the source of nitrogen in plants so important a problem during the 19th century?
9. Mention three theories that were proposed to account for the supposed extraction of nitrogen by plants from the air.
10. Describe the experiments at Rothamsted conducted to determine whether or not plants secure nitrogen from the air. What was the result of these experiments?
11. By whom, how, and when was the source of nitrogen of legumes discovered?
12. What important lessons are illustrated by the investigations relating to the source of nitrogen in plants?
13. What did these men contribute to early plant science: Camerarius, Koelreuter, Sprengel, and Naudin?
14. Who originated the theory of evolution? Why was it not accepted at that time?
15. What was the status of plant and animal improvement at the time of publication of the "Origin of Species"?
16. What did Darwin contribute to the theory of evolution? Why is he usually given credit for it?
17. What is the theory of natural selection?
18. Describe the work of Mendel and tell why he was successful.
19. What is the mutation theory? Chromosome theory of heredity? Multiple factor hypothesis?
20. What was the prevailing belief in spontaneous generation of life when Pasteur began investigating the subject?
21. What are the principal contributions of Pasteur?
22. Name several advances made in bacteriology since the time of Pasteur.
23. Name 5 important discoveries in plant pathology.

CHAPTER III

LOGIC IN EXPERIMENTATION

I. Scope of Science

Science is systematized knowledge. The function of science is the classification of observations and the recognition of their sequence and relative significance.
Its scope is to ascertain truth in every branch of knowledge. Sound logic is just as fundamental to good science as accurate data. The thought process most important in science is induction, i.e., reasoning from the particular to the general. Generalizations may lead to laws and principles about natural phenomena.

II. Science among the Ancients

"Primitive peoples lived through thousands of years of myth and magic, while science was rising out of slow and unconscious observations of natural events," Weir (1936) explains. Aristotle (384-322 B.C.), one of the first to stress science, taught that it can be developed only through reason. He set up a logical scheme, called the syllogism, which severely limited deductions made from generalizations. Jevons (1870) describes the syllogism as follows: "In a syllogism we so unite in thought two premises or propositions put forward, that we are enabled to draw from them or infer, by means of the middle term they contain, a third proposition called the conclusion." An example is as follows:

"All living plants absorb water; (major premise)
A tree is a living plant; (minor premise)
Therefore, a tree absorbs water." (conclusion)

The syllogism has been rejected for a long time because it leads to no new knowledge. It involves a deductive process, the conclusions being only as accurate as the premises upon which they are based. New generalizations can only be reached through induction, a process which affords a means to attack the premises themselves.

A -- Methods and Types of Research

III. Research

In the broad sense, the collection and analysis of data is research. However, there are different degrees of research value. Black, et al. (1928) state that the mere accumulation of facts, computation of averages, or census-taking is not research. Fact-gathering alone is a mechanical procedure unless tied up with analysis.
Moreover, projects designed to serve purely local or temporary needs, without some contribution to fundamental principles, ordinarily can have but little scientific value. General laws or principles are sought in research of the highest order.

There are two methods of research, the empirical and the inductive. Black, et al. (1928) state that "the essential difference between the two is that the first one accepts superficial relationships without inquiry as to antecedents, whereas the second one pursues antecedents a stage or two at least." An antecedent is a condition or circumstance that exists before an event or phenomenon.

IV. Inductive or Scientific Method

The process of induction is of special importance in experimental science, in which general laws are established from particular phenomena. The inductive method is the scientific spirit of the day.

(a) Explanation of Induction

In induction one proceeds from less general, or even from individual facts, to more general propositions, truths, or laws of nature. In other words, it is the formulation of a principle from facts. Induction was the method of Francis Bacon, who held that general laws could be established with complete certainty by almost mechanical processes. Bacon advised that one begin by collecting facts, classifying them according to their agreement and difference. It is then possible to induce from their differences and similarities the possible reasons for the relationships exhibited and, from them, arrive at laws of greater and greater generality. Thus, the inductive method attempts to answer the question "why". A knowledge of causes enables the scientist to forecast with greater and greater assurance because, when he knows what is behind a set of relationships, he is in a much better position to know whether or not they will occur again.
On the other hand, deduction is the inference from the general to the particular, i.e., some truth may allow individual facts to be subsumed under it. Induction and deduction are used together in experimental work. For instance, a premature induction may be made to account for a phenomenon. A hypothesis is set up that may or may not be faulty. Next, an experiment is designed, a purely deductive process, to test this hypothesis. The investigator determines the particular instances he may create and observe by experiment to use as a basis of a re-generalization to establish the original hypothesis.

(b) Observation

The first requisite of induction is experience to furnish the facts. Such experience may be obtained by observation or experiment. Jevons (1870) makes this statement: "To observe is merely to notice events and changes which are produced in the ordinary course of nature, without being able, or at least attempting, to control or vary these changes." The botanist usually employs mere observation when he examines plants as they are met with in their natural condition. Progress of knowledge by mere observation has been slow, uncertain, and irregular in comparison with that attained in the controlled experiment. However, to observe well is an art that is extremely advantageous in the pursuit of the natural sciences. One should make accurate discrimination between what he really does observe and what he infers from the facts observed. The investigator should be "uninfluenced by any prejudice or theory in correctly recording the facts observed and allowing to them their proper weight", according to Jevons (1870).

(c) Experimentation

In the experimental method in its pure form, a special hypothetical plan becomes the basis of conclusions. The investigator varies at will the combinations of things and circumstances, and then observes the result. Fisher
(1937) describes experimental observations as "only experience carefully planned in advance, and designed to form a secure basis of new knowledge; that is, they are systematically related to the body of knowledge already acquired, and the results are deliberately observed, and put on record accurately." In actual practice, the effect of different factors is determined by holding all conditions constant or uniform except the one or ones whose "effects" are to be measured, a definite amount of change in this condition being balanced against a definite amount of change in the result. Black, et al. (1928) state that it is sometimes only the effect of the presence or absence of a condition that is noted. The method is qualitatively experimental instead of quantitatively in such cases. In many cases, it is impossible to hold all conditions but one constant or even uniform. So statistical analysis is combined with the experimental design to measure variation where it cannot be controlled. This is the practice in many agronomy experiments. For instance, when two or more wheat varieties are compared for yield, they are planted in the same field, at the same time, and at the same rate. Moreover, they are harvested at the same time, threshed by the same machine, and the seed weighed on the same balance. The conditions are thus uniform for the varieties rather than constant. The importance of the experiment is well summarized by Jevons: "It is obvious that experiment is the most potent and direct mode of obtaining facts where it can be applied. We might have to wait years or centuries to meet accidentally with facts which we can readily produce at any moment in a laboratory . . . ."

(d) Essentials of Good Scientific Method

The essentials in sound experimental method may be briefly summarized as follows:

1. The formulation of a trial hypothesis.
2. A careful and logical analysis of the problem generated by the hypothesis.
3.
Use of the deductive method to design how to effect a solution of the problem. This involves a detailed outline of the experiment with costs, equipment, methods, etc. The factors should be expressed in quantitative terms when possible.
4. Control of the personal equation.
5. Rigorous and exact experimental procedure with the collection of data pertinent to the subject.
6. Sound and logical reasoning as to how the conclusions bear on the trial hypothesis and in the formulation of generalizations. A statement of the exact conclusions warranted from the cases examined should be made in accurate terms.
7. A complete and careful report of data and methods of analysis so that others can check them.

V. The Empirical Method

"When a law of nature is ascertained purely by induction from certain observations or experiments, and has no other guarantee for its truth, it is said to be an empirical law," according to Jevons (1870). Thus, knowledge is empirical when one merely knows the nature of phenomena without being able to explain the facts. It only answers the question "how". Formerly, the empirical method represented knowledge secured by trial, but today it means the haphazard "cut and try" method. A person who learns certain facts through repeated observations may know no reason for their being true, i.e., he cannot bring them into harmony with any other scientific facts. The method is valuable in spite of the criticisms against it. Empirical methods are most likely to be used when a science is new. Facts must be gathered before a notion of reasons can be formulated. The older crop rotation experiments were empirical. Recommendations are based on the results, i.e., certain rotation systems result in higher crop yields.
Crop variety tests are generally empirical, since the chief concern is to determine what variety yields the highest.* Fully one-half the agronomic experiments in this country are haphazard in the nature of their relationship to the body of known knowledge in a given line. Too often they are not related to past experiments. (See Allen, 1930).

VI. General Types of Agronomic Experiments

Agronomic experiments can be divided into field and laboratory or greenhouse experiments. Questionnaires and surveys are occasionally used to secure preliminary information.

*Note: In recent years, variety tests may involve more than empiricism. Crosses are often made to combine high yield with certain desirable quality factors or disease resistance. The yield trial determines whether or not the result has been accomplished.

(a) The Field Experiment

The field experiment involves the use of small plots, usually between 1/10 and 1/1000 acre in size. The treatments are replicated, i.e., repeated on the experimental area in tests designed to remove the error due to soil heterogeneity. To make other conditions as uniform as possible, the varieties or treatments in the experiment are treated as nearly the same as possible except for the factor or factors under study. The field experiment has a wide application where yield is used as a criterion to measure treatment effect. Field experiments may be classified as follows:
(1) Variety Tests: Such trials usually measure the yield of strains, varieties, and species. Various combinations of forage crops for hay or pasture are sometimes classified as variety tests.
(2) Rate and Date Tests: These experiments are concerned with the yield response of a variety or crop when planted at different rates or on different dates.
(3) Crop Rotation Tests: These trials include different series of rotations and crop sequences.
(4) Cultural Studies: The time, manner, and frequency of field operations are considered in such tests.
(5) Fertilizer Experiments: These experiments usually include tests to determine the needs of nitrogen, phosphorus, and potassium and their best combinations. Other considerations are ways to supplement farm manures, value of cover crops and green manures, and the amounts and methods of lime application.
(6) Pasture Experiments: Field experiments with pastures are generally used to study methods to seed and fertilize new pastures, methods to renovate old pastures, and the influence of grazing on species survival. (See Noll, 1928).

(b) Laboratory and Greenhouse Experiments

Laboratory and greenhouse tests are often used to supplement field trials. These tests often involve potometer and lysimeter studies as well as those based on special techniques. Pot cultures are sometimes necessary for the study of the effect of one factor by the exclusion of the others, or by their exaggeration. However, the sole use of laboratory experiments may result in erroneous conclusions when applied to field conditions. (See Wheeler, 1907). The use of laboratories and greenhouses is on the increase because they have the advantage of controlled conditions. Some agronomic problems adapted to such conditions are: (1) artificial rust epidemics, (2) toxic effect of sorghums on crops that follow, (3) fertilizer cultures, (4) resistance of winter wheat to low temperatures, and (5) moisture, temperature, and light relationship studies. Equipment for the study of hardiness in crop plants has been described by Peltier (1931).

Potometers are pots filled with soil in which plants are grown for experimental purposes. To a greater or less extent the earlier investigators assumed the accuracy of such experiments when applied to field conditions. Lysimeters are modified soil tanks used to measure the magnitude of nutrient losses from the soil by leaching under various fertilizer and cropping conditions. Installation of lysimeter equipment is expensive but permanent.
The principal feature is the measurement of drainage water. A description of lysimeter equipment is given by Lyon and Bizzell (1918) and by the American Society of Agronomy (1933).

(c) Questionnaires and Surveys

Very little use is made of either the questionnaire or survey in agronomic research. They are considered less desirable than the controlled experiment. The questionnaire consists of a set of questions to be answered without the aid of an investigator (usually mailed). It is impossible to secure accurate answers on questions that are closely defined because the chances for misinterpretation are too great. Survey data are collected with the personal aid of an enumerator or investigator. Spillman (1917) assumes that careful analyses of the methods of a large number of farmers under essentially similar soil, climatic, and economic conditions, may be made to reveal the success of one person and the failure of others. He found that the discrepancy in the farmer's knowledge was small in large items, but increased as the importance to him decreased. Black, et al. (1928) mention some of the weaknesses of the survey: (1) It does not furnish enough detail for some types of problems; (2) It is not accurate enough for close analysis; and (3) It does not furnish a large enough sample for some purposes.

VII. Hypotheses, Theories, and Laws

The difference between the hypothesis, the theory, and the law, is in the degree of surety or the absolute.

(a) Explanation of these Terms

When an idea is suggested by observed phenomena it is spoken of as a hypothesis. It represents a desire to explain the phenomena such as, for example, the method by which plants take food from the soil. The hypothesis is important in the deductive method in that, to test this preliminary induction, it is replaced more or less completely by imagining the existence of agents which are thought adequate to produce the known effects in question.
Thus, Jevons (1870) explains, the truth of a hypothesis altogether depends upon subsequent verification. A theory is a limited and inadequate verification of a hypothesis. Examples are the theory of the gene, and the theory of evolution. A theory becomes a law when it is proved to be a fact beyond a reasonable doubt. The Mendelian laws of heredity are good examples of laws.

(b) Formulation of a Hypothesis

There are certain advantages to the hypothesis: (1) It correlates facts; (2) it forecasts other facts; and (3) it allows for discrimination between valuable and useless information. Every experiment is the result of a tentative hypothesis thought out in advance of the actual test. The hypothesis is based on the recognition of coincident phenomena, or upon a familiarity with possible causes and effects. Hibben (1908) states: "Hypothesis and experiment to Charles Darwin were like a two-edged sword which he employed with rare skill and effect." The hypothesis is the precursor of the experiment which is merely an effort to solve the problem created by the hypothesis.

(c) Qualities of a Good Hypothesis

There are several qualities that a good hypothesis should possess. These are as follows: (1) It should be plausible. (2) It must be capable of proof, i.e., it should provide a susceptible means to attack the problem created thereby. (3) It must be adequate to explain the phenomena to which it is applied. (4) It should involve no contradiction. (5) A simple hypothesis is preferable to a complex one. There is little use to form a hypothesis on a complex basis unless it is possible to collect the data by which it may be proved. A multiple hypothesis is made up of several ideas. Occasionally it may be desirable to formulate several hypotheses. Salmon (1928) advises an investigator to at least give consideration to all observable hypotheses. They are useful even though wrong because they eliminate that particular idea from the problem.
At any time, an investigator must be ready to abandon a hypothesis or theory when further data prove the previous views untenable.

(d) Null Hypothesis

In all experimentation the null hypothesis is characteristic. The term has been applied by R. A. Fisher (1937) in his "Design of Experiments." The basic assumption is that no difference exists between the treatments in the experiment, i.e., they are samples drawn from the same general population. For instance, in a variety test, the investigator makes the basic assumption that all varieties yield alike. He can never prove this assumption but he may disprove it in the course of experimentation. By the use of certain statistical arguments he may show a significant discrepancy from the hypothesis, i.e., the probability is that some of the varieties do differ in yield. Fisher (1937) states: "Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis."

(e) Crucial Tests

There may be two alternative conceptions or explanations which appear possible. A crucial test (experimentum crucis) is one by which two rival hypotheses can be tested so that if one is proved, the other is immediately disproved. This is the only means by which a hypothesis may be disproved. The first record of the application of the crucial test is attributed to Francis Bacon. A good example of a crucial test was the one applied by Richey and Sprague (1931) to test two theories for the cause of hybrid vigor in corn which is expressed when two inbred lines of reduced vigor are crossed to give the first generation hybrid. This additional vigor, Richey (1927) explains, has been attributed to the physiologic stimulation hypothesis in which heterogeneous germplasm within the cells provides the stimulation.
The other hypothesis is that of dominant growth factors in which it is believed that the maximum number of dominant growth factors are brought together in the first generation hybrid, and that linkages of favorable dominant growth factors with other less desirable factors prevented the recovery of individuals as vigorous as the F1 in subsequent generations. Richey and Sprague (1931) applied a crucial test to the two hypotheses by the collection of data on the principle of convergent improvement, i.e., backcrossing the F1 hybrid to each of the two inbred lines that went into the hybrid. It was hoped to transfer some of the favorable dominant growth factors from one of the lines and intensify them in the other. Thus, the two convergently improved lines would have less differences between them than was true of the original inbred lines. Lowered yields of the cross of the convergently improved lines, as compared to the cross of the original lines, would tend to support the physiological stimulation hypothesis. The same or higher yields from the cross of the convergently improved lines would lend support to the dominant growth factor hypothesis. The data collected gave support to the latter.

B -- Kinds of Evidence

VIII. Importance of Evidence

It is necessary to collect facts or data before generalizations can be made. There are different kinds of evidence, some kinds being more apt to lead to valid conclusions than others. However, plants are complex organic compounds with the result that it is more difficult to determine the elements of cause and effect than is ordinarily true in the more stable physical sciences. Environment has a tremendous influence on the plant. The more that experiments or observations are repeated with the same results, the more valid the evidence becomes in the minds of all normal human beings. For example, a large number of experiments show that weed control is the principal benefit derived from cultivation.
The fact that a large number of investigators have found this to be true under different conditions adds to the assurance that the results are correct. Certain methods have been developed to deal with the evidence obtained by observation or experiment which may serve as guides to those in search of general laws of nature.

IX. Cause and Effect

Induction consists of inferring general conclusions from particular evidence. In some cases, generalizations relate to cause and effect. An antecedent is a condition which exists before the event or phenomenon, while a consequent follows after the antecedents are put together. Jevons (1870) makes this statement: "By the cause of an event we mean the circumstances which must have preceded in order that the event should happen. Nor is it generally possible to say that an event has one single cause and no more. There are usually many different things, conditions or circumstances necessary to the production of an effect, and all of them must be considered causes or necessary parts of the cause." It is certainly true that a multiplicity of causes is often involved in experiments in field crops and soils.

X. Qualitative Evidence

Qualitative evidence is that which can be measured only categorically. For example, seeds either germinate or fail to germinate. Classification by color is a common form of qualitative data.

(a) Method of Agreement

This method of induction is defined by Jevons (1870) as follows: "The sole invariable antecedent of a phenomenon is probably its cause." It is necessary to collect as many instances as possible and compare together their antecedents. The one or more antecedents which are always present when the effect follows are considered the cause. For example, when rust is present on wheat, low yields are obtained. Therefore, rust causes low yields. This method has a serious difficulty in that the same effect in different cases may be due to different causes.
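The comparison of antecedents described under the method of agreement can be sketched as a set intersection. The following is only a minimal illustration under stated assumptions: the field records, the antecedent labels, and the function name `common_antecedents` are all invented for the example and do not come from any experiment cited here.

```python
# Hypothetical records: each instance lists the antecedents that were noted
# and whether the effect (a low yield) followed. All values are invented.
instances = [
    ({"rust", "late planting", "heavy soil"}, True),
    ({"rust", "early planting", "light soil"}, True),
    ({"rust", "late planting", "light soil"}, True),
    ({"early planting", "heavy soil"}, False),
]


def common_antecedents(records):
    """Return the antecedents present in every instance where the effect followed."""
    positive = [antecedents for antecedents, effect in records if effect]
    if not positive:
        return set()
    # Antecedents shared by all positive instances are the candidate causes.
    return set.intersection(*positive)


candidates = common_antecedents(instances)
print(sorted(candidates))  # prints ['rust']
```

If one of the low-yield instances had been caused by drought with no rust present, the intersection would be empty, which is the difficulty noted above: the same effect may arise from different causes in different cases.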

(b) Method of Difference

In this method, the antecedent which is always present when the phenomenon follows, and absent when it is absent, is the cause of the phenomenon when other conditions are held constant. In other words, when the circumstances are all in common except one, i.e., the treatment, then the change that occurs is the effect of the treatment. This is probably the most widely used method in experimentation. The difference in crop yields under certain manurial treatments is an example of this method.

(c) Joint Method

In the words of Jevons (1870), the joint method of agreement and difference "consists in a double application of the method of agreement, first to a number of instances where an effect is produced, and secondly, to a number of quite different instances where the effect is not produced." For example, the experiments of Darwin on cross and self-fertilized plants may be cited. He placed a net around 100 heads to protect them from chance insect pollination. He also placed 100 heads of the same variety where they were exposed to bees. The protected flowers failed to yield a single seed, while the unprotected ones produced 2,720 seeds. Thus, cross fertilization by means of insect pollination was proved to be a cause of seed set in this case.

XI. Quantitative Evidence

Every science, and every question is first a matter of generalizations built upon qualitative evidence. The effort to more firmly substantiate such generalizations leads to the measuring of evidence quantitatively so that by degrees, the evidence becomes more and more precisely quantitative.

(a) Method of Concomitant Variations

This method can be applied where the phenomena can be measured. Every degree and quantity of the phenomenon adds new evidence in support of relationships that exist between antecedents and consequents of the phenomenon.
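As a rough numerical illustration of concomitant variation, the sketch below computes a product-moment (Pearson) coefficient for paired measurements of an antecedent and a consequent. The rust and yield figures are invented, and the function name `pearson_r` and the choice of this particular coefficient are assumptions made for the example rather than a procedure taken from the text.

```python
import math


def pearson_r(x, y):
    """Product-moment coefficient for two equal-length lists of measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical plots: percentage of stem rust and yield in bushels per acre.
rust_percent = [5, 20, 35, 50, 65, 80]
yield_bu = [38, 34, 29, 25, 18, 12]

r = pearson_r(rust_percent, yield_bu)
print(round(r, 3))  # a value close to -1: yield falls as rust increases
```

A coefficient near -1 or +1 indicates that the two quantities vary closely together, while a value near 0 indicates little linear relationship. The coefficient alone does not show which quantity is the antecedent.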
The method which employs concomitant variations to determine the degree of such relationship is called correlation. For instance, an experiment with wheat results in a low yield under conditions of heavy stem rust infestation, with variations to the other extreme.

(b) Method of Residues

There may be several causes, each of which produces part of an effect, and where it may be desirable to know how much of the effect is due to each. This type of evidence consists in the analysis of a given phenomenon to determine the residue. For instance, manure contains something besides phosphorus, potash, and nitrogen as shown by the residues. In plants it has been determined that other than the so-called 10 essential elements are used because analyses of the plant ash show others to be present. The method of residues is constantly employed in chemical determinations.

XII. Relation to the Original Hypothesis

Some experiments fail in their objective in that there is insufficient evidence at hand to permit the investigator to draw positive conclusions. However, this evidence is valuable. It has been called "negative evidence," but in reality there is no such thing. Research would be much further along than it is today if all experiments had been reported in which the evidence was insufficient to prove the hypothesis that was originally set up by the investigator. Such evidence would have saved other workers from a repetition of the work.

XIII. Use of Analogy

Analogy is a form of inference in which it is reasoned that, if two (or more) things agree with one another in one or more respects, they will probably agree in still other respects. It is the simplest and most primitive form of evidence, its great weakness being the fact that the cases compared may not be parallel. Analogy may be tested by some inductive method. For example, the theory of evolution was suggested to Darwin from the "Essay on Population" by Malthus.
It suggested to him that the struggle for existence is the inevitable result of the rapid increase in organic beings. The idea necessitated natural selection or "survival of the fittest." Another example might be cited in durum wheat. Durum wheat is adapted to Russia and so is Turkey wheat. Since Turkey wheat is adapted to the Great Plains in this country, durum wheat must be adapted to this region also. A common analogy made by agriculturists is that crops can be improved by systematic selection because livestock breeders have succeeded in that way. Logic derived from analogy too often leads the inexperienced astray.

C -- Methods of Discovery

XIV. Work of Other Investigators

An investigator seldom takes up work today that is entirely new. He secures valuable help from other research workers. The cooperative attitude among the workers on the Purnell corn projects is particularly commendable in this respect. They get together occasionally to talk over their problems freely and to offer suggestions. They have been unusually free with their preliminary data and unpublished results so far as fellow workers are concerned. This attitude has done much to advance research in corn improvement. The seed analysts have cooperated among themselves in a similar manner. Scientific meetings result in a more or less free exchange of ideas to the benefit of all. These get-togethers are a great aid and should be attended by research workers.

XV. Surprises and Accidental Discoveries

An important discovery is quite often made by accident. Several examples could be cited.

(a) Lemon Juice in Grasshopper Bait

Some 25 years ago, two workers in the U. S. Department of Agriculture were testing poison bran mash as a grasshopper bait in Kansas. These men had oranges in the lunch they took to the field with them. While eating their oranges, some of the juice accidentally came in contact with the bran mash.
The men noticed that the grasshoppers preferred the mash that contained the orange juice. As a result of this discovery, Kansas came out with the lemon juice formula in 1911.

(b) Heterothallism in Stem Rust

Prior to 1927, it was believed that the pycnia on the upper surface of the barberry leaf had no function. Craigie (1927) got the idea that the mycelium, pycnia, and pycniospores of some of the pustules were plus sex strains and others minus sex strains. He happened upon the proof by chance. The first fly of the season appeared in the greenhouse on May 17. He watched it idly as it sipped nectar at one pustule and then at another. Professor Buller happened by and said at once: "The solution of the problem is an entomological one. Copy the fly. Take the plus pycniospores to the minus pycnia, and the minus pycniospores to the plus pycnia." Craigie followed this advice by mixing nectar from different pustules. The pycniospores germinated and brought on the development of aecia and aeciospores, the diploid phase. He repeated his test many times and found it to be true. Craigie proved his theory as follows: Flies were introduced in some cages containing barberry plants with pustules on the leaves, while flies were excluded from other barberry plants. Aecia were formed in five days where the flies were present, but none were formed where the flies were excluded.

XVI. Systematic Research

One of the principal methods of discovery is through systematic research where a problem is attacked from all conceivable angles. An example is the contribution of the Hawaiian Experiment Station on chlorosis. The pineapple industry was restricted to a small area because of a discoloration of the foliage that showed it to lack chlorophyll. The investigators on this problem first exhausted the possibilities of disease, after which they analyzed the soil and found it to contain considerable manganese.
Next, the workers used this high-manganese soil on soil that would grow pineapples, and found that very little iron was taken into the plants. The manganese was thus found to inhibit iron absorption. The plants were then sprayed with iron salts and the chlorophyll deficiency corrected. Pineapple plants are now sprayed at the rate of 50 pounds of iron salts per acre, the yield of fruit being doubled as a result.

XVII. Other Methods of Discovery

Several other methods have resulted in significant discoveries.
(1) Conflicting Results: Disagreement between different research workers in their results often leads to new discoveries. Pasteur became engaged in a controversy with Liebig on the spontaneous generation of life. As a result, Pasteur proved that all new life arose from forms that had already existed. Some of the most fertile fields for new ideas are the first new hypotheses, theories, and ideas.
(2) Accurate Work: Accurate work is necessary to secure dependable facts on which to base conclusions. More information usually results from work done carefully than from that which has been unplanned and carried out in a haphazard manner. In addition, the work of investigators must be accurate to withstand the close scrutiny of other workers and of general opinion. Accurate work often has led to new discoveries.
(3) Analogy: A fruitful source of new ideas that sometimes leads to new discoveries is analogy. It may suggest a hypothesis from the results secured in other experiments.
(4) Ideas from Farmers: In agricultural research, the problems called to the attention of experiment station workers by farmers are an important source of discovery.

References

1. Allen, E. W. Initiating and Executing Agronomic Research. Jour. Am. Soc. Agron., 22:341. 1930.
2. Black, John D., et al. Research Method and Procedure in Agricultural Economics (mimeographed), pp. 1-20, 58-90, 113-126, and 298. 1928.
3. Craigie, J. H.
Discovery of the Function of the Pycnia of the Rust Fungi. Nature, 120:765-767. 1927.
4. Fisher, R. A. The Design of Experiments. Oliver and Boyd. 2nd Ed. pp. 1-12, and 18-20. 1937.
5. Hibben, J. G. Logic: Deductive and Inductive, pp. lo9-182, 222-277, and 291-329. 1908.
6. Jevons, W. Stanley. Elementary Lessons in Logic, pp. 9-16, 116-117, 126-135, 201-210, and 218-27o. 1870 (reprinted in 1928).
7. Lyon, T. L., and Bizzell, J. A. Lysimeter Experiments. Cornell Memoir 12. 1918.
8. Noll, C. F. The Type of Problem Adapted to Field Plot Experimentation. Jour. Am. Soc. Agron., 20:421-1^5. 1928.
9. Pearson, Karl. The Grammar of Science, pp. l-lpt. 1911.
10. Peltier, G. L. Control Equipment for the Study of Hardiness in Crop Plants. Jour. Agr. Res., 43:177-182. 1931.
11. Richey, F. D. The Convergent Improvement of Selfed Lines of Corn. Am. Nat., 61:430-449. 1927.
12. Richey, F. D., and Sprague, G. F. Experiments on Hybrid Vigor and Convergent Improvement in Corn. Tech. Bul. 267. U.S.D.A. 1931.
13. Salmon, S. C. Some Limitations in the Application of Least Squares to Field Experiments. Jour. Am. Soc. Agron., 15:225-239. 1923.
14. Salmon, S. C. Principles of Agronomic Experimentation. Kans. St. Agr. Col. (Unpublished Lectures). 1928.
15. Spillman, W. J. Validity of the Survey Method of Research. Dept. Bul. 529, U.S.D.A. 1917.
16. Standards for the Conduct and Interpretation of Field and Lysimeter Experiments. Jour. Am. Soc. Agron., 25:803-828. 1933.
17. Weir, W. W. Soil Science, pp. 11-16. 1936.
18. Wheeler, H. J. Some Desirable Precautions in Plot Experimentation. Jour. Am. Soc. Agron., 1:39-44. 1907.

Questions for Discussion

1. What is science?
2. What is the syllogism? Give an example.
3. Why has the syllogism been abandoned in experimental work?
4. What is research? Discuss different values of research.
5. How does the inductive method of science differ from the empirical?
6.
Why is it considered desirable to determine basic or fundamental laws rather than merely to determine what happens?
7. Distinguish between induction and deduction.
8. What part does observation play in research work? What precautions are necessary in its use?
9. What is an experiment? Discuss its use.
10. What are the principal steps in the inductive method of science? Which ones are most often omitted?
11. Under what conditions is the empirical method justified?
12. Name some types of agronomic tests that are empirical in nature.
13. What serious limitation is true of the empirical method?
14. What are some reasons for criticism of the methods of research?
15. Classify field experiments and describe each class.
16. What place have laboratory and greenhouse tests in agronomic research?
17. How do questionnaires and surveys differ?
18. Distinguish between potometer and lysimeter tests.
19. Distinguish between hypothesis, theory, and law.
20. Is it desirable to formulate hypotheses in experimental work? Why?
21. What qualities are necessary in a good hypothesis?
22. What is a working hypothesis?
23. What advantages are there, if any, in formulating multiple hypotheses?
24. What is the null hypothesis?
25. What is a crucial test? Explain one.
26. Why is research often more difficult in plant sciences than that in the physical sciences?
27. Distinguish between cause and effect.
28. Name, define, and illustrate five different kinds of evidence.
29. What is the most important inductive method in experimentation? Why?
30. What is analogy? Discuss its use and give an example.
31. What is the value of negative evidence?
32. Mention 4 ways in which discoveries are made.
33. How was the cause of chlorosis found in pineapples in Hawaii?
34. Mention 3 discoveries and tell how they originated.

CHAPTER IV
ERRORS IN EXPERIMENTAL WORK

I. Types of Experimental Error

Two kinds of error are common in experimental work, systematic errors and chance errors.
The investigator needs to be familiar with both kinds. Such errors should be distinguished from mistakes and blunders. For example, a worker makes a mistake when he puts down a weight of 10 lbs. when the scale actually showed the weight to be 20 lbs.

(a) Systematic Errors

Systematic errors occur every time that an experiment is repeated in the same way. Most experimental plans involve some errors of this kind. For example, suppose that a large number of winter wheat varieties are arranged systematically in single-row plots. Some of the varieties kill out because of lack of hardiness. The varieties in the adjacent rows might yield abnormally high because of the additional space from which they could draw moisture. Such an error would be repeated every time the experiment is conducted in this manner. In this particular case, the competition effect could have been avoided by planting three-row plots for each variety and only the center row harvested for yield.

(b) Chance Errors

Errors which occur by pure chance with no definite assigned cause are known as chance errors. They are generally small fluctuations due to minor causes. Chance errors may accumulate to produce a sizeable deviation even though it be impossible to foresee and analyze all causes that contribute to them. The principal reason for statistical analysis in agronomic science is its very inexactness and the inability to control chance errors. In case the present theory of plot technique is acceptable, the variations in plot yields are due to chance errors and, in most cases, have been found by experience to be normally distributed. This means that there are a large number of small errors and a small number of large errors. Statistical methods are employed in field experiments to measure the effect of chance errors. In addition, some systematic errors can be removed by these methods as will be shown later.

II.
Sources of Error in Experimental Work

Evidence gained by experiment is disputed, according to Fisher (1937), either on the grounds that the interpretation is faulty, or on the criticism that the experiment itself is poorly designed. Errors are always possible and seldom absent in experimentation.

(a) Faulty Design and Inferior Technique

Experimental designs are inadequate or faulty when they do not afford a proper opportunity for statistical analysis to analyze and measure experimental errors, both chance and systematic. Fisher states: "If the design of an experiment is faulty, any method of interpretation which makes it out to be decisive must be faulty too." The investigator may fail to take certain variable factors into account. Aside from these, various personal errors may have been introduced, such as carelessness. Farrell (1913) lists a few sources of error in field experiments. Among the controllable ones are: Incorrect weights of crop products, faulty determinations of plot area, variations in quantities of products recovered and wasted, unobserved variations in field treatments, etc. Among the errors seldom controlled, he cites: Plant variation, soil irregularities, uneven distribution of soil moisture, and temperature variations. Frequently, the total effect from all causes is great enough to influence the conclusions of the experiment. It might be added that some of these errors can be measured and their influence on the conclusions removed.

(b) Improper Interpretation of Results

Two common types of misinterpretation of experimental results are drawing conclusions from too few data, and carrying the interpretation beyond the points actually tested.
(1) Conclusions drawn from too few data: An experiment may be inadequately replicated in time and space to justify the conclusions drawn. Carleton (1909) warns that some experiments are defective because they are run for an insufficient length of time.
Sometimes investigators are in too much of a hurry to ob- tain results. Another common mistake is to over -emphasize small differences. Sta- tistical methods have done a great deal towards reducing invalid inferences due to too few data. (2) Interpretation carried beyond points tested : Sometimes the in- terpretation of the results of an experiment is carried beyond the points actually tested. Salmon (I923) believes that one of the chief sources of error in agronomic literature is the tendency to generalize from experiments limited in their scope. For instance, it should be quite obvious that laboratory tests may not always be applied to field conditions. Such generalization must be justified by a similarity of conditions. As an example, suppose phosphates were added to the soil in a fer- tilizer test in amounts of 100, ^00, and 600 pounds per acre. One would be unable to draw conclusions on, , say 1000 pounds, because it is beyond the amount tested in the experiment. It is obvious that a point may be reached where the addition may have a depressive effect. Sievers (1925) points out that recommendations based on variety tests conducted under different conditions as to soil, climate, and weather than those under which the farmer operates are unsatisfactory. III. The Personal Factor Individuals differ greatly in the way they attack problems and carry out the various details connected with them. For example, two men will seldom agree exactly when they make measurements on the same thing because they do not "see exactly alike". Such differences are apt to be more pronounced when personal judgment plays an im- portant role. The mixing, of materials illustrates a situation where individual work- ers may differ in the details of their procedures to an extent that the end-product is affected. Mechanical devices tend to do away with the personal factor. 
When several individuals work on an experiment it is desirable for the same person to complete an entire operation, or at least to handle all the treatments in a single replicate. For example, in a variety test the same person should plant the plots, harvest them, and make the weights so far as possible. At least, the same crew should carry out the details uniformly for all plots or treatments, preferably for the entire test.

IV. Sources of Variation in Field Experiments

Certain limitations in plot work must be recognized. To quote Noll (1928): "The most serious are that the experiments must be made under constantly changing conditions as to moisture and temperature, and that the average results for a given soil in a given locality, no matter how carefully planned, are not necessarily applicable elsewhere." The most common variations in field experiments are those due to plants, those due to differences in seasons, and those due to the soil. The variations that cannot be balanced out can be measured in a well-designed experiment. Some of those due to defined causes, such as soil heterogeneity, can be removed or balanced in part but not entirely. The variations that are due to unrecognized causes are measured and assigned to experimental error.

(a) Errors Related to the Plant

Variation may be introduced due to differences in acclimatization unless this factor happens to be the one under study. Differences in stand may be a fruitful source of variation, particularly in crops like corn, sorghums, etc., where plant individuality is important. There are fewer corn plants on a unit area than wheat plants, so each plant contributes a larger share of the plot yield. A further source of variation due to plants is the difference in moisture content of the harvested crop. Correction to a uniform moisture basis is advocated under such conditions. Plant competition may introduce still further error in plot results.

(b) Variations in Seasons

Climate rather than soil may be the limiting factor in crop production.
Some varieties are known to withstand extreme conditions like drouth or excessive moisture better than others. From uniformity trials with corn over a 3-year period, Smith (1909) concluded that more variation in yield could be expected in seasons unfavorable for the crop. For that reason, tests conducted for only one or two years may be very misleading. This situation may be remedied by the extension of a variety test over a number of seasons to determine the variety that thrives best in an average season. For a reliable average of seasonal conditions, a variety test should be conducted for at least three years and preferably more. Under dryland conditions, it takes at least 10 years to secure a reliable variety average. Variety comparisons should be strictly comparable, i.e., compared only for the same years under test. Usually this is accomplished by expressing yields in percent of the standard or check.

Other factors that may cause the yields of varieties to vary from season to season are:
(1) The plots may be damaged by windstorms one year and not in another.
(2) Rodents may cause more damage in some years than in others.
(3) Insects may be troublesome in certain seasons.
(4) Rust in small grains may reduce yields more in some years than in others.
(5) There may be an inaccuracy in scale weights from one season to another.
(6) Carelessness in harvesting or threshing is another factor in some seasons.
(7) Sometimes the planter fails to drill out to the end of a plot with a possible error in yield as a result.
(8) Crooked rows may introduce errors in the yields of row crops.

(c) Errors due to Soil Variation

It is impossible to secure a perfectly uniform soil for field experiments. Differences in productive capacity commonly occur in different portions of the same field. In fact, soils vary in composition and productivity from foot to foot with the result that it is impossible to say that any soil is uniform, even on small areas.
However, the investigator should secure as uniform a piece of ground as possible. Sedentary soils are usually more uniform than drift soils, and level land more likely to be uniformly productive than hilly land. Other factors that may introduce variation are: topography, under-drainage, sub-soil, and previous soil management practices.

V. Errors in Laboratory and Greenhouse Tests

There are many possibilities for error in tests of this kind. Probably the most serious one is to draw conclusions from laboratory tests for field conditions without a field test. Laboratory tests should supplement, rather than replace, the field experiment.

(a) Errors in Greenhouse Tests

Some of the possibilities for error may be listed as follows:
(1) The number of plants is small. Plant individuality assumes major importance, particularly when the investigator works with large plants.
(2) There may be unequal distribution of water. It is difficult to get a uniform distribution of water through a heavy soil.
(3) It is often important that the exact amount of water in a soil be known. This is particularly true in pots for freezing tests.
(4) There may be a lack of uniformity in the soil itself. This may be alleviated by thoroughly mixing the soil into a homogeneous mass. The mixed soil should be packed uniformly in all pots.
(5) Some insects may be restricted only to greenhouse conditions. As a result, the behavior in the field may be entirely different so far as insects are concerned.
(6) There may be a temperature or light differential under controlled conditions. A lack or over-balance of either or both may introduce a systematic error in the experiment.

Le Clerg (1935), in a uniformity trial with 400 small pots in a greenhouse experiment, found the per cent of damping-off in sugar beets to be less in the border-row pots on a raised concrete bench than in those farther removed from the heat pipes.
The effect was almost absent in a bench provided with wall boards to deflect direct heat. The unequal exposure to light or heat may be corrected in some instances by rotation of the pot table periodically.

(b) Comparison of Potometer and Field Trials

Data from pot experiments and field trials were found by Coffey and Tuttle (1915) to agree closely in fertilizer experiments. However, many fertilizer analogies from pot tests have led to errors in interpretation. Kezer and Robertson (1927) found no agreement between potometers and field plots in irrigation studies with wheat. Potometers with late irrigation treatments became so dry that the soil pulled away from the edge of the can. When water was added, most of it ran down the cracks and out of reach of the root systems of the stunted plants.

VI. Statistical Methods in Relation to Variation

The statistical method is the mathematical means to measure and describe variation and to allocate its component parts to certain recognized sources. Variation can be measured quantitatively through the medium of an experimental design that takes into account the recognizable sources of variation. The measurement of total variation makes it possible to obtain a measure of that due to all uncontrolled sources. The statistical method concludes its role when it gives the experimenter a means to compare the measures of variation due to the recognized possible causal factors both with the variation classified as error and with each other. Thus, conclusions can be drawn in regard to the relative importance of the sources of variation.

VII. Classical Fallacies in Agronomy

A number of fallacies in agronomy have been listed by Salmon (1929). Many of these ideas were accepted as facts until rather recently. An analysis of these fallacies shows how each came to be accepted by agriculturists.
(a) Conservation of Moisture by the Dust Mulch

The effectiveness of the soil mulch in the conservation of soil moisture has been under discussion for many years. The early work, on which the dust mulch theory was based, was performed in the laboratory. Between 1885 and 1900, King (1907) showed that the dust mulch was quite effective in the reduction of water evaporated from the soil surface. In fact, the water loss was about one-half that from a bare soil. However, King worked in the laboratory with soil in tubes, the water table being only 22 inches from the soil surface. On the basis of this and similar experiments has rested the conviction that the soil mulch would reduce evaporation losses and materially aid in the conservation of moisture. This theory was believed and practiced until tests by the Office of Dry Land Agriculture (USDA) proved that it was without foundation. Call and Sewell (1917) showed that the soil mulch failed to increase the moisture in the soil. In fact, the mulched plots actually lost more water than bare undisturbed soil. The limit of capillary rise from a free water surface is only about 10 feet, according to the work of Shaw and Smith (1927). However, they found moisture losses to be quite rapid from unmulched soil where the water table was 4 to 6 feet from the surface. Other experiments in Illinois, Missouri, and Nebraska have shown that corn yielded almost as much where the weeds were scraped with a hoe as where the plots were cultivated (mulched). Shaw (1929) reworked King's experiment using soil tubes 4 feet high, and maintaining a constant water table at the base of each. The loss in the mulched tube was 38 per cent less than that from the tube in which the soil was left bare. This test merely confirmed the fact that the results from these soil tubes could not be applied to field conditions where the free water surface is usually more than 10 feet from the soil surface.
Under dryland conditions where moisture conservation is extremely important, the water table is very often 200 to 400 feet from the surface.

(b) Deep Plowing for Moisture Conservation

The theory that very deep plowing will save moisture by an increase in the storage volume of the soil is an old one that dates back to about 1880. It was sometimes advocated that the soil be stirred from 14 to 18 inches deep. Deep tillage was widely advocated on the Great Plains about 1910 by Hardy W. Campbell. Most of the implements used were soon allowed to rust out in fence corners. Experimentation very quickly showed that deep tillage (14 to 18 inches deep) was impractical or actually depressed the yields under dryland conditions. Brandon (1925) found that winter wheat grown on plots subsoiled every two years actually yielded 1.3 bushels per acre less as a 15-year average than wheat on land plowed at ordinary depths. Similar results were obtained in Wyoming by Nelson (1929).

(c) Continuous Selection of Small Grains

It was believed at one time that continuous selection was a means to invariably improve small grains. After 50 years of continuous selection, Vilmorin concluded that no improvement had resulted in wheat, a self-fertilized crop. The pure line theory worked out by Nilsson-Ehle and by Johannsen showed that selection was effective only in heterozygous material. This old idea on the value of selection was probably due to a disregard of the difference between self-fertilized and cross-fertilized plants.

(d) Selection of Seed Corn by Score-Card Standards

Arbitrary score card standards were improvised in the early days as ideals for seed selection in corn. These standards laid stress on such points as shape of kernel, length of kernel, ears with well-filled butts and tips, percentage of grain on the cob, weight of ear, etc. Uniformity of ears was particularly stressed.
The height of the belief in the "pretty ear" was reached about 1910 when the most "perfect" ear at the National Corn Show sold for several hundred dollars. When planted in the field in comparison with ordinary ears, it failed to surpass them either in yield or quality. This started a great amount of research on the relation of score card points to yield. It was generally proved that such arbitrary standards are of little value. In fact, close selection for type was generally shown to result in an approach to homozygosity with a reduction in yield and vigor as a consequence. Some of the investigators who aided in the upset of this theory were: Cunningham (1916); Love and Wentz (1917); Olson, Bull and Hayes (1918); Kiesselbach (1922); and Richey (1925).

(e) Calcium-Magnesium Ratio in Soils

A physiological balance seems to be necessary in nutrient solutions for normal plant growth. In 1892, Loew proposed the calcium-magnesium ratio hypothesis. He worked out the optimum ratio for a number of different plants in water cultures. He concluded that either calcium or magnesium used alone was toxic, but that the toxicity disappeared when the ratio of these elements fell within certain limits. The ratios which Loew used varied from 1 CaO : 1 MgO to 7 CaO : 1 MgO. A large amount of investigation has been conducted on this ratio in which it has been shown that a rather definite ratio of CaO to MgO is required in nutrient solutions for optimum plant growth. The same applies to other nutrient elements as well. However, there appears to be little evidence to support the necessity for a definite ratio of CaO to MgO in soils. Recently, Moser (1933) reported that the ratio itself showed no relation to crop yields. The beneficial effect of lime added to the soil was attributed to the increase in replaceable calcium rather than to an alteration of the calcium-magnesium ratio.
It is sufficient to state that Loew conducted his experiments with water cultures which probably react differently from soils.

(f) Addition of Burnt Limestone to the Soil

It is still believed by some farmers that the addition of burnt limestone to the soil results in a destruction of organic matter and an increase in the soil acidity. That burnt limestone increased the acidity was reported by the Pennsylvania Experiment Station. The theory, as taught, was based on small analytical differences in soil analyses.

(g) Acid Phosphate and Soil Acidity

The use of green manure and acid phosphate was at one time said to increase soil acidity. Grass and green material were known to decay and give an acid under laboratory conditions. Careful work under field conditions has shown that bacteria use up the organic acid formed. Acid phosphate was thought to increase soil acidity because of the name. It has been changed to superphosphate recently for psychological reasons.

References

1. Brandon, J. F. Crop Rotation and Cultural Methods at the Akron Field Station. Dept. Bul. 1304, USDA. 1925.
2. Call, L. E., and Sewell, M. C. The Soil Mulch. Jour. Amer. Soc. Agron., 9:49-61. 1917.
3. Carleton, M. A. Limitations in Field Experiments. Proc. Soc. for Agri. Sci., pp. 55-61. 1909.
4. Coffey, G. N., and Tuttle, H. F. Pot Tests with Fertilizers Compared with Field Trials. Jour. Amer. Soc. Agron., 7:128-135. 1915.
5. Cunningham, C. C. The Relation of Ear Characters of Corn to Yield. Jour. Amer. Soc. Agron., 8:188-196. 1916.
6. Farrell, F. D. Interpreting the Variation of Plot Yields. Cir. 109, BPI, USDA, pp. 27-32. 1913.
7. Fisher, R. A. Design of Experiments, pp. 1-12. 1937.
8. Kezer, A., and Robertson, D. W. The Critical Period of Applying Irrigation Water to Wheat. Jour. Amer. Soc. Agron., Vol. 19, No. 2. 1927.
9. Kiesselbach, T. A. Corn Investigations. Nebraska Agr. Exp. Sta. Res. Bul. 20. 1922.
10. King, F. H. Physics of Agriculture. 1907.
11. Le Clerg, E. L.
Factors Affecting Experimental Error in Greenhouse Pot Tests with Sugar Beets. Phytopath., 11:1019-1025. 1935.
12. Lipman, Chas. B. A Critique of the Hypothesis of the Lime-Magnesia Ratio. Plant World, 19:83-105, and 119-135. 1916.
13. Love, H. H., and Wentz, J. B. Correlations Between Ear Characters and Yield in Corn. Jour. Amer. Soc. Agron., 9:315-322. 1917.
14. Moser, F. The Calcium-Magnesium Ratio in Soils and Its Relation to Crop Growth. Jour. Amer. Soc. Agron., 25:265-377. 1933.
15. Nelson, A. L. Methods of Winter Wheat Tillage. Wyo. Agr. Exp. Sta. Bul. lol. 1929.
16. Noll, C. F. The Type of Problem Adapted to Field Experimentation. Jour. Amer. Soc. Agron., 20:421-425. 1928.
17. Olmstead, L. B. Some Applications of the Method of Least Squares to Agricultural Experiments. Jour. Amer. Soc. Agron., 6:190-204. 1914.
18. Olson, P. J., Bull, C. P., and Hayes, H. K. Ear Type Selection and Yield in Corn. Minn. Agr. Exp. Sta. Bul. l*jk. 1918.
19. Richey, F. D. Corn Judging and the Productiveness of Corn. Jour. Amer. Soc. Agron., Vol. 17, No. 6. 1925.
20. Salmon, S. C. Principles of Agronomic Experimentation (Unpublished lectures). Kansas State College. 1929.
21. Salmon, S. C. Some Limitations in the Application of the Method of Least Squares to Field Experiments. Jour. Amer. Soc. Agron., 15:225-239. 1923.
22. Shaw, C. F. When the Soil Mulch Conserves Moisture. Jour. Amer. Soc. Agron., 21:1165-1171. 1929.
23. Shaw, C. F., and Smith, A. Maximum Height of Capillary Rise Starting with Soil at Capillary Saturation. Hilgardia, 2:399-409. 1927.
24. Sievers, F. J. Outstanding Weaknesses in Investigational Work in Agronomy. Jour. Amer. Soc. Agron., 17:88-69. 1925.
25. Smith, L. H. Plot Arrangement for Variety Experiments with Corn. Proc. Amer. Soc. Agron., 1:84-39. 1909.

Questions for Discussion

1. Distinguish between chance and systematic errors.
2. What errors in field experiments can be controlled?
3. What kinds of errors in field experiments are not controlled? How are they minimized?
4. What errors can be made in the interpretation of experimental results?
5. How may the personal factor influence experimental results?
6. What are the general sources of variation encountered in field experiments?
7. What factors cause plot yields to differ from season to season?
8. What errors may occur in greenhouse tests?
9. How did the soil mulch theory originate and, in the light of present knowledge, how might the error have been prevented?
10. Is there any experimental or scientific basis for the belief that very deep plowing (10 inches or more) is profitable? Explain how this idea originated.
11. How did the belief that good seed corn is characterized by deep, rough kernels and cylindrical ears originate?
12. What was the basis for the belief that a certain calcium-magnesium ratio was necessary for plant growth?
13. Explain the origin of the idea that burned lime decreases organic matter in the soil.
14. Is there any reason to believe that acid phosphate or green manure increases soil acidity? Why was it thought they did?
15. Make a general statement which will explain the sources of error that have occurred in agronomic science.

FIELD PLOT TECHNIQUE

Part II

Statistical Analysis of Data

CHAPTER V

FREQUENCY DISTRIBUTIONS AND THEIR APPLICATION

I. Measurements and Collection of Data

Quantitative data, collected as a result of measurements, are widely used in research work. To measure a quantity is to determine by any means, direct or indirect, its ratio to the unit employed in expressing the value of that quantity (Weld, 1916). Every measure has some sort of linear scale, either straight or curved, on which the magnitudes are read. This is because the human eye can measure length far more accurately than it can most other magnitudes. However, the investigator should realize that there is no such thing as an exact measurement. Seldom will a re-weight or re-measurement give exactly the same quantity because of inaccuracies that arise from imperfect apparatus and judgment in estimation. An observer may tend to over-estimate, or his measurements may be prejudiced, or his judgment may fluctuate. Because it is next to impossible to arrive at a true value, measurements should be made as carefully as possible in order to obtain the closest approximation. The units of measure will depend upon the degree of precision required in the work.

One should distinguish between errors and the inaccuracies due to carelessness; the latter are more properly called mistakes. They consist of blunders like reading the wrong number on the scale, recording a figure in a notebook wrong, forgetting to deduct tare, etc. It is much easier to check the accuracy of weights when they are made more than once. Eternal vigilance and care are necessary to reduce mistakes to the minimum. The investigator should realize that it is impossible to evolve sound results from unsound or carelessly collected data merely through the application of a formula.

II. Statistics in Experimental Work

After data are collected, it becomes desirable to describe them, interpret them, and induce from them. This is the realm of statistics.

(a) Statistics Defined

Statistics may be regarded as the mathematical analysis based on the theory of probability applied to observational data in an attempt to summarize and describe them so that conclusions can be drawn concerning the phenomena that supply the data. Fisher (1934) states that the original meaning of statistics suggests it was a study of populations of human beings living in political union. The methods developed, however, have little to do with political unity. In fact, they are applied to populations, animate or inanimate.

(b) Use of Statistics

Statistics are used in astronomy, biology, genetics, education, psychology, and many other fields.
They are particularly applicable to data concerned with life or the products of life. Probably 75 to 80 per cent of the agronomic workers in agricultural experiment stations use statistical methods, although only about one-half of these apply statistics to other than yield data. However, statistical methods are being used more extensively as time goes on.

III. Some Typical Statistical Terms

The effort to characterize and describe the data mathematically leads to the calculation of various statistics. The simplest of these is the average or mean. It is natural for the first step to be an attempt to find a single measure which will best describe the sum total of the information expressed in a mass of data. The best single measure is the mean. However, it fails to tell the entire story. Among the other statistics are the median, mode, average deviation, standard deviation, coefficient, and correlation ratio. Among the values derived from these statistics are their standard errors and probable errors, employed in the important problem of estimation and prediction.

(1) By a variable is meant any organ or character which is capable of variation or difference in size or kind. This difference may be measurable directly, as in height, temperature, weight, etc., or only indirectly, as in the case of color, occupation, etc. Variation may be continuous or discrete*. For example, a temperature change from 60 to 61 degrees must pass continuously through every intermediate state between 60 and 61 degrees. On the other hand, variation may take place by integral steps without intermediate values, as in population which can never go up or down by less than one.

(2) A variate (x) is an individual value of a variable, e.g., 3 feet, 200 grams, 15 pounds, etc.
(3) The frequency (f) is the number of times a particular variate (x) occurs between two limiting values of a variable, i.e., the number of variates in any one class.

(4) A population is the totality of individuals which are to be studied with regard to a character and may be finite or infinite.

(5) A sample may be all or a part of a population. A random sample is a sample taken in such a way that all individuals which make up a population have an equal chance of being included in the sample.

IV. Rules for Computation

It is desirable to be consistent in the number of decimal places used in computations, and in the manner of dropping decimals. Suppose it is desired to retain two decimal places. When the digit in the third decimal place is exactly 5, an odd digit before it is raised to the next even digit, so that a number like 82.575 becomes 82.58. When the digit in the third decimal place is greater than 5, the second decimal is raised by one; when it is less than 5, the digit is simply dropped. For the square root of a quotient to be accurate to two decimal places, it is recommended that the quotient be carried to four decimal places. This is especially important where the square root is to be used in multiplications for other computations.

V. Arithmetic Average or Mean

Masses of unorganized data explain little or nothing. Individual measures are less significant than a typical value which stands for a number of measurements. An average or mean is such a value. It is the single constant most commonly employed to describe the sample.

(a) Simple Arithmetic Mean

The mean may be considered the center of gravity of a sample. It is equal to the sum of the individual measurements divided by their number.

$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{N}$ or $\bar{x} = \frac{Sx}{N}$ (1)

where $Sx$ = the sum of all the variates, $N$ = the total number of variates, $\bar{x}$ = the arithmetic mean, and $x_1, x_2, \ldots, x_n$ = the individual variates. For example, the yields of Golden Glow corn on 3 plots were 84.8, 86.9, and 89.9 bushels per acre.
The arithmetic mean would be:

$\bar{x} = \frac{84.8 + 86.9 + 89.9}{3} = 87.2$

*Note: This usage is somewhat different from that in Genetics, where a discontinuous variation refers to a germinal change that breeds true, while a continuous variation applies to variations that are due to environment and are non-heritable.

(b) Mean of Replicated Variates¹

It must be remembered that the weight of each variate must be equal in the sample. When certain variates are repeated, the computation may be shortened by merely considering each distinct variate multiplied by the number of times it appears. Suppose 7 corn plants of variety "A" were measured for height in the first replication and were found to average 59 inches. In the second replication, 3 plants were measured, and averaged 67 inches. A total of 20 plants were measured for height in a third replication and found to average 54 inches. Suppose one desired to know the average height for the variety. A simple arithmetic mean of 59, 67, and 54 (i.e., 60 inches) would be incorrect because a different number of plants made up the original means in the different replications. The mean must be calculated so as to give due weight to each variate for the number of times that it occurs. For instance, the mean may be calculated as follows:

$\bar{x} = \frac{(59 \times 7) + (54 \times 20) + (67 \times 3)}{30} = \frac{1694}{30} = 56.47$ in.

The same result may be obtained by the addition of the original 30 variates and dividing by 30.

VI. The Frequency Distribution

The mean for replicated variates may be calculated from a frequency table, which is a simple device by which a considerable quantity of data may be organized in condensed and classified form. Some data presented by Goulden (1937) on the yields in grams of 400 barley plots will be used to illustrate the frequency table. The yields which follow represent an aggregate of data in which there are 400 variates. Each measurement is a variate, i.e., a particular measured value of the variable (x) yield.
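The two means of section V can be checked with a few lines of Python. This is only a sketch: the function names `simple_mean` and `weighted_mean` are illustrative and are not part of the manual. The figures are the Golden Glow yields and the corn heights quoted above.

```python
def simple_mean(values):
    """Sum of the variates divided by their number (formula 1)."""
    return sum(values) / len(values)


def weighted_mean(means, counts):
    """Mean of replicated variates: each replicate mean is weighted
    by the number of plants that produced it."""
    total = sum(m * n for m, n in zip(means, counts))
    return total / sum(counts)


golden_glow = [84.8, 86.9, 89.9]   # bushels per acre on 3 plots
heights = [59, 67, 54]             # replicate means, inches
plants = [7, 3, 20]                # plants measured in each replicate

print(round(simple_mean(golden_glow), 1))        # 87.2
print(simple_mean(heights))                      # 60.0, the incorrect unweighted value
print(round(weighted_mean(heights, plants), 2))  # 56.47
```

The unweighted figure of 60 inches overstates the height because the second replicate, with only 3 plants, counts as much as the third with 20.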
Yields i n Grams of 400 Squ are. _Yar d_ Plots o f Barley^- 135 162 I36 157 l4l 130 129 176 171 190 157 1^7 176 126 175 13^ I69 I89 180 128 169 205 129 117 144 125 165 170 153 186 164 123 165 203 156 182 164 176 176 150 216 154 184 203 166 155 215 190 164 204 194 148 162 146 174 185 171 181 158 147 165 157 180 165 127 186 133 170 134 177 109 169 128 152 165 139 146 144 178 188 133 128 161 160 167 156 125 162 128 103 116 87 123 143 130 119 141 174 157 168. 195 180 158 139 139 168 145 166 118 171 143 132 126 171 176 115 165 147 186 157 187 174 172 191 155 169 139 144 130 146 159 164 160 122 175 156 119 135 116 134 ■•', 157 182 209 136 153 160 142 179 125 149 171 186 196 175 189 214 169 166 164 195 189 108 118 149 178 171 151 192 127 143 158 174 191 134 188 248 164 206 135. 192 147 178 189 141 173 187 167 128 139 152 167 131 203 231 214 177 161 194 141 161 124 130 112 122 192 155 196 179 166 156 13'L 179 201 122 207 189 1.64 131 211 172 170 140 156 199 181 181 150 184 154 200 I87 169 155 107 143 145 190 176 162 123 189 194 146 2£ 160 107 70, "34 112 162 124 136 138 101 138 l4l l43 135163 1S3 99 118 150 151 33 136 171 191 155 164 98 136 115 168 130 111 136 129 122 120 179 172 192 171 151 142 193 174 146 180 140 137 138 194 109 120 124 126 126 147 115 148 195 154 149 139 163 118 126 127 139 174 167 175 179 172 174 3.67 142 169 122 163 144 147 123 160 137 161 122 101 158 103 119 3.64 112 57 ' > (§3) 106 132 122 164 142 155 147 115 143 68. 184 183 167 160 138 191 153 loO 156 122 111 153 - ] -43 103 131 180 142 191 175 146 101 ••111 ■• 110 154 176 168 175 175 146 148 167 106 123 121154 148 91 93 74 113 79 131U9 96 86 97 98 106 107 69 86 94 129 /r " • ■ :-\ ':.,' Mi- ** l-This has been sometimes called a "weighted" mean. ;■'■'.•; 2 Data from Methods of Statistical Analysis by C. H. Goulden, p. 7, 1937. ko : (a) Grouping of Data into Classes The above data are unwieldy in their present form, even though quite. simple in nature. They may he condensed by grouping. 
First, find the highest and lowest values of the variates (barley yields in grams). The interval thus defined by these extreme values is known as the range. In this case it is 22 to 248. The next step in the formulation of a frequency table or distribution is to separate the range into classes. Although unnecessary, it is usually convenient for the classes to have equal range (interval) within themselves. The number of classes to be formed is the next question. Experience has shown that somewhere between 7 and 20 classes is a desirable number with which to work. The smaller the number of classes the greater is the error due to grouping. The approximate number of classes can be determined from a formula given by Yule (1929):

Number of Classes $= 2.5 \sqrt[4]{N} = 2.5 \sqrt[4]{400} = 11.18$

where $N$ is the number in the sample. Suppose 12 classes are decided upon. The quotient of the range divided by the number of classes is the approximate class interval, viz., 226/12 = 18.8. However, a class interval with an odd number is more convenient because the midpoint of each class does not require an additional decimal. Suppose 19 is selected as the class interval. The value of a class is taken at its mid-value. The barley data may be tabulated for a class interval of 19 as follows:

Class Range (gm)   Class Value (x) (gm)   Tabulation   Frequency (f) (No.)
22-40              31                     |            1
41-59              50                     |            1
60-78              69                     ||||         4
79-97              88                                  12
98-116             107                                 31
117-135            126                                 69
136-154            145                                 80
155-173            164                                 97
174-192            183                                 78
193-211            202                                 21
212-230            221                                 4
231-249            240                                 2
                                                       N = 400

(This tabulation can be continued in like manner for the other variates.)

(b) Frequency Table

After the data are tabulated they are next arranged in a frequency table, i.e., the frequencies are entered to correspond to their class values.
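The class-forming steps of section VI(a) can be sketched in Python as follows. The helper names (`yule_class_count`, `make_classes`, `tally`) are hypothetical, and the ten yields are assumed to be the first ten entries of the barley table as read from the printed page; they serve only to show how variates fall into classes.

```python
def yule_class_count(n):
    """Approximate number of classes: 2.5 times the fourth root of n."""
    return 2.5 * n ** 0.25


def make_classes(lowest, interval, count):
    """Return (lower limit, upper limit, mid-value) for each class."""
    classes = []
    for i in range(count):
        lower = lowest + i * interval
        upper = lower + interval - 1
        classes.append((lower, upper, (lower + upper) // 2))
    return classes


def tally(values, lowest, interval, count):
    """Count how many variates fall in each class."""
    freq = [0] * count
    for v in values:
        freq[(v - lowest) // interval] += 1
    return freq


lowest, highest, n = 22, 248, 400
print(round(yule_class_count(n), 2))       # 11.18
print(round((highest - lowest) / 12, 1))   # 18.8

classes = make_classes(lowest, interval=19, count=12)
print(classes[0], classes[-1])             # (22, 40, 31) (231, 249, 240)

# Assumed: first ten yields of the barley table.
first_ten = [135, 162, 136, 157, 141, 130, 129, 176, 171, 190]
print(tally(first_ten, lowest, 19, 12))
```

With an odd interval such as 19 every mid-value is a whole number, which is the convenience the text points out.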
Frequency table for the barley yields:

   x        f         fx
  31        1         31
  50        1         50
  69        4        276
  88       12      1,056
 107       31      3,317
 126       69      8,694
 145       80     11,600
 164       97     15,908
 183       78     14,274
 202       21      4,242
 221        4        884
 240        2        480
        N = 400   S(fx) = 60,812

The mean ($\bar{x}$) for this sample can be conveniently calculated from the frequency table. Each class value is multiplied by its frequency (f) to give fx. These values are summed to give S(fx) and divided by the total number in the sample. For the barley yields:

$$\bar{x} = \frac{S(fx)}{N} = \frac{60{,}812}{400} = 152.03$$

It should be evident that the classification of the data into a frequency distribution has distorted them from their original form.

(c) Graphical Representation of Frequency Table

A visible representation of a large number of measurements is afforded by either a histogram or a frequency polygon. The histogram is most commonly used. The character to be measured is represented along the horizontal axis (abscissa), while the frequencies are represented vertically (ordinate) to correspond to each class. For example, the barley yield data may be plotted as follows:

[Figure: Histogram of the barley yield data. Abscissa: class values in grams (x), 31 to 240; ordinate: frequency (f), 0 to 100.]

The frequency polygon is constructed by joining in sequence the midpoints of the tops of the bars of the histogram. Its shape tends towards the smooth curve of the population from which the sample was drawn. The frequency polygon for the barley yield data is as follows:

[Figure: Frequency polygon of the barley yield data. Abscissa: class values in grams (x), 31 to 240; ordinate: frequency (f), 0 to 100.]

VII. Measures of Central Tendency

There are three measures of central tendency that must be defined at this point. (1) The arithmetic mean, already discussed, is the center of gravity of the population. (2) The median is the measure of the middle variate in an ordered arrangement of the variates according to magnitude.
(3) The mode is the measure of the class of greatest frequency, or the point at which the most variates occur. In other words, it is the x-value at which the frequency polygon has the highest ordinate.

VIII. Types of Frequency Distributions

Before one goes further with the analysis to describe the nature of the aggregate of the data, it is necessary to determine roughly the type of frequency distribution. Some mathematical expression is essential corresponding to those types most often encountered in actual practice. (1) A great many frequency distributions found in practice are unimodal, i.e., have one peak. (2) There is a general tendency for them to be bell-shaped when the frequency polygon or diagram is smooth.

It was early noticed that the curve derived from the theoretical distribution of the expansion of a binomial, $(a + b)^n$, possessed many of the same characteristics as frequency distributions met with in actual practice. However, the binomial distribution fails to represent continuous variation. An effort to find a mathematical equation for a curve which would fit well the points of a binomial distribution led to the discovery of what is known as the normal probability curve and its equation. Types of distributions most commonly approached in the graphical representation of data are the normal, binomial, and the Poisson distributions.

(a) Normal Distribution

The normal curve is a bell-shaped, symmetrical curve. It is characterized by the symmetrical arrangement of the items around the central value. The arithmetic mean, median, and mode coincide in the normal curve. As in the case of many frequency distributions, the small deviations from the central value (mean) occur more frequently and the larger deviations less frequently. Fisher (1934) gives the statures of 1375 women in a curve that closely approaches a normal curve.
[Figure: Frequency curve of the statures of women, from Fisher (1934). Abscissa: height in inches; ordinate: number in each group.]

(b) Binomial Distribution

The binomial distribution is represented by the expansion of the binomial $(p + q)^n$.* To understand the application of the binomial distribution to data, it is first necessary to make some study of probability. This subject will be treated later.

*Note: $(p + q)^n = p^n + n\,p^{n-1}q + \dfrac{n(n-1)}{1 \cdot 2}\,p^{n-2}q^2 + \dfrac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}\,p^{n-3}q^3 + \cdots + q^n$

(c) Poisson Distribution

The Poisson distribution is biometrically unsymmetrical, i.e., it is extremely skew. This type of distribution results from an attempt to represent the expansion of $(p + q)^n$ when "p" is extremely small. This type seems particularly applicable to purity and germination counts in seed testing, as well as to many other applications.

(3) Other Types of Distributions

Sometimes two or more factors influence the shape of a frequency distribution so that it has two peaks. This would be a bimodal curve. When the data which provide two unimodal frequency distributions with two substantially different means are combined into one frequency distribution, the distribution that results may be bimodal due to the fact that nonhomogeneous data overlap.* This happens occasionally in genetic data.

IX. Some Constants Used to Describe Distributions

There are several constants or statistics used to describe distributions. Those of position or central tendency (mean, mode, median) have been discussed already. The constants commonly used to measure dispersion of the variates are the standard deviation, quartile deviation, and the average deviation.

(a) Standard Deviation

The standard deviation of the sample (s') is most frequently used in statistical work to measure dispersion. It is sometimes called the standard error of a single observation. The squared standard deviation $(s')^2$ is the sum of the squares of the deviations from the mean divided by the number.
This is sometimes called variance, or the second moment about the mean.

$$(s')^2 \text{ (variance)} = \mu_2 = \frac{S(x - \bar{x})^2}{N} \qquad (2)$$

where $\mu_2$ is the second moment. The standard deviation (s') is the square root of the variance. The formula for the standard deviation may be expressed as follows:

$$s' = \sqrt{\frac{d_1^2 + d_2^2 + d_3^2 + \cdots + d_N^2}{N}} = \sqrt{\frac{Sd^2}{N}} \text{ or } \sqrt{\frac{Sfd^2}{N}} \qquad (3)$$

where d is the deviation from the mean, e.g., $d_1 = x_1 - \bar{x}$.

The above formula gives the standard deviation of the sample about its mean. When it is desired to use this result as an estimate of the standard deviation of the population (s) about its mean (m), N-1 should be used in the denominator instead of N. This makes little difference in the result when the sample is large, but N-1 should be used when the sample is small, i.e., when N is less than [figure illegible], as an arbitrary rule.

As an example, the calculation of the standard deviation of a sample (s') can be illustrated with the barley yields as grouped in VI (b) above. The deviations for each class are taken from the actual mean, i.e., 152.

*Note: Pearson's generalized frequency curves or the Gram-Charlier method of curve-fitting should be used for a finer method of analysis for such distributions.

   x       f        fx        d        fd        fd²
  31       1        31     -121      -121     14,641
  50       1        50     -102      -102     10,404
  69       4       276      -83      -332     27,556
  88      12     1,056      -64      -768     49,152
 107      31     3,317      -45    -1,395     62,775
 126      69     8,694      -26    -1,794     46,644
 145      80    11,600       -7      -560      3,920
 164      97    15,908       12     1,164     13,968
 183      78    14,274       31     2,418     74,958
 202      21     4,242       50     1,050     52,500
 221       4       884       69       276     19,044
 240       2       480       88       176     15,488
      N = 400   S(fx) = 60,812             S(fd²) = 391,050

$$\bar{x} = \frac{S(fx)}{N} = \frac{60{,}812}{400} = 152.03$$

$$s' = \sqrt{\frac{Sfd^2}{N}} = \sqrt{\frac{391{,}050}{400}} = 31.27$$

Thus, 31.27 is the standard deviation (s') of this sample. However, the best estimate for the standard deviation of the population (σ) from which this sample was drawn would be:¹

$$s = \sqrt{\frac{Sfd^2}{N-1}} = \sqrt{\frac{391{,}050}{399}} = 31.31$$

Another formula for the calculation of the standard deviation of the sample (s') has been recommended by J.
Arthur Harris for machine calculation:

$$s' = \sqrt{\frac{Sx^2}{N} - \left(\frac{Sx}{N}\right)^2} \qquad (4)$$

This formula is essentially the same as the one given above except that the variates themselves are used rather than their deviations from the mean.² The calculation of the standard deviation of the sample (s') by this formula is illustrated with the barley yield data as follows:

¹Note: The estimate (s) of the population standard deviation (σ) may be computed from the sample standard deviation (s'):

$$s = s'\sqrt{\frac{N}{N-1}}$$

²Note: With $d = x - \bar{x}$,

$$d^2 = (x - \bar{x})^2 = x^2 - 2\bar{x}x + \bar{x}^2$$

$$Sd^2 = Sx^2 - 2\bar{x}\,Sx + N\bar{x}^2$$

$$\frac{Sd^2}{N} = \frac{Sx^2}{N} - 2\bar{x}\,\frac{Sx}{N} + \bar{x}^2 = \frac{Sx^2}{N} - \bar{x}^2 \quad \left(\text{remembering that } \frac{Sx}{N} = \bar{x}\right)$$

Therefore,

$$\sqrt{\frac{Sd^2}{N}} = \sqrt{\frac{Sx^2}{N} - \left(\frac{Sx}{N}\right)^2}$$

 (1)     (2)      (3)         (4)
   x       f       fx         fx²
  31       1       31         961
  50       1       50       2,500
  69       4      276      19,044
  88      12    1,056      92,928
 107      31    3,317     354,919
 126      69    8,694   1,095,444
 145      80   11,600   1,682,000
 164      97   15,908   2,608,912
 183      78   14,274   2,612,142
 202      21    4,242     856,884
 221       4      884     195,364
 240       2      480     115,200
      N = 400   S(fx) = 60,812   S(fx²) = 9,636,298

Note: Multiply column No. 1 by column No. 3 to obtain fx².

$$\bar{x} = \frac{S(fx)}{N} = \frac{60{,}812}{400} = 152.03$$

$$s' = \sqrt{\frac{Sfx^2}{N} - \left(\frac{Sfx}{N}\right)^2} = \sqrt{\frac{9{,}636{,}298}{400} - \left(\frac{60{,}812}{400}\right)^2} = \sqrt{24{,}090.7450 - 23{,}113.1209} = 31.27$$

(b) Coefficient of Variability

The coefficient of variability (C.V.) is the standard deviation of the sample (s') expressed in percentage of the mean. This gives a relative measure of dispersion so that variation may be compared in features expressed in different units of measurement. It would often be impossible to compare the variabilities of two experiments unless they were expressed in a common unit. The formula is as follows:

$$\text{C.V. (Coefficient of Variability)} = \frac{100\,s'}{\bar{x}} \qquad (5)$$

For the barley data, it is as follows:

$$\text{C.V.} = \frac{(31.27)(100)}{152.03} = 20.57$$

X.
Sheppard's Correction for Grouped Data

An error is introduced by grouping variates into classes due to the fact that the midpoint of the class is likely to deviate from the mean of the distribution by more than the mean of the variates grouped in the class in question. This is particularly true for the extreme classes. The majority of the variates in a class are grouped on the side nearest the mean of the distribution. This error can be compensated for mathematically by the use of Sheppard's correction. This correction is equal to 1/12 of the square of the class interval (C), and is subtracted from the value of the squared standard deviation $(s')^2$ as ordinarily obtained, i.e., $(s')^2 - C^2/12$. However, Sheppard's Correction is applicable only to large samples where the variables are continuous.

To calculate the standard deviation without Sheppard's Correction is to assume that the variates in each class are grouped with the highest frequency at the mean of the class as shown in the diagram. To do this evidently leads to an error in that s' will be computed larger than it actually is. Sheppard's Correction compensates for this type of error, which results from grouping data in a frequency distribution.

XI. Short-Cut Methods for Computation of Statistics

So far the statistics for simple frequency distributions have been calculated. Several short-cut methods are used which greatly reduce the labor of computation. These methods give the same results. Usually the computations are made from an arbitrary origin or guess mean (w), with the guess mean corrected to give the true mean ($\bar{x}$) of the sample. The guess mean can be taken at any position. Usually it is taken at the middle of the range or at the lowest class.

The method of computation by use of an arbitrary origin, or guess mean, can be shown with the barley yield data.
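As a cross-check on the grouped-data statistics of sections IX to XI, the following Python sketch recomputes the mean, the standard deviation, Sheppard's correction, and the coefficient of variability from the barley frequency table, and then repeats the mean and standard deviation by the arbitrary-origin method. The names `grouped_stats` and `guess_mean_stats` are hypothetical, chosen only for this illustration. Results may differ in the last decimals from hand computations that round intermediate values (for example, deviations taken from 152 rather than 152.03).

```python
import math

# Class values and frequencies from the barley frequency table.
X = [31, 50, 69, 88, 107, 126, 145, 164, 183, 202, 221, 240]
F = [1, 1, 4, 12, 31, 69, 80, 97, 78, 21, 4, 2]

def grouped_stats(x, f, interval):
    """Mean, s', estimate s, Sheppard-corrected s', and C.V. for grouped data."""
    n = sum(f)
    mean = sum(fi * xi for xi, fi in zip(x, f)) / n
    variance = sum(fi * (xi - mean) ** 2 for xi, fi in zip(x, f)) / n
    sd = math.sqrt(variance)                                  # s', divisor N
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "sd_estimate": math.sqrt(variance * n / (n - 1)),      # s, divisor N-1
        "sd_sheppard": math.sqrt(variance - interval ** 2 / 12),
        "cv": 100 * sd / mean,
    }

def guess_mean_stats(x, f, guess, interval):
    """Mean and s' computed from an arbitrary origin (guess mean)."""
    n = sum(f)
    d = [(xi - guess) / interval for xi in x]                 # deviations in class units
    d_bar = sum(fi * di for di, fi in zip(d, f)) / n          # Sfd / N
    mean = guess + interval * d_bar
    sd = interval * math.sqrt(sum(fi * di ** 2 for di, fi in zip(d, f)) / n - d_bar ** 2)
    return mean, sd

stats = grouped_stats(X, F, interval=19)
for name, value in stats.items():
    print(name, round(value, 4))
print(guess_mean_stats(X, F, guess=31, interval=19))
```

Taking the guess mean at the first class (31) keeps every deviation positive; any other class value gives the same mean and standard deviation.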
Computation from a guess mean taken at the first class (w = 31, C = 19):

   x       f       d       fd       fd²
  31       1       0        0         0
  50       1       1        1         1
  69       4       2        8        16
  88      12       3       36       108
 107      31       4      124       496
 126      69       5      345     1,725
 145      80       6      480     2,880
 164      97       7      679     4,753
 183      78       8      624     4,992
 202      21       9      189     1,701
 221       4      10       40       400
 240       2      11       22       242
      N = 400         Sfd = 2,548   Sfd² = 17,314

¹Note: It can be readily proved that a guess mean can be used provided a correction is applied to obtain the true mean. Let $\bar{x}$ = the true mean, w = the guess mean, C = a constant (class interval), d = the deviation from the guess mean, and N = the number. Then

$$x = Cd + w$$

$$\bar{x} = \frac{Sfx}{N} = \frac{Sf(Cd + w)}{N} = \frac{C\,S(fd)}{N} + \frac{w\,S(f)}{N} = C\bar{d} + w, \quad \text{since } S(f) = N$$

Symbols: w = guess mean, $\bar{d}$ or $Sfd/N$ = correction to the guess at the mean, and C = class interval.

$$\bar{d} = \frac{Sfd}{N} = \frac{2548}{400} = 6.37$$

$$\bar{x} = w \text{ (guess mean)} + C\bar{d} = 31 + (19)(6.37) = 31 + 121.03 = 152.03$$

$$s' = C\sqrt{\frac{Sfd^2}{N} - \bar{d}^2} \text{ or } C\sqrt{\frac{Sfd^2}{N} - \left(\frac{Sfd}{N}\right)^2} = 19\sqrt{\frac{17314}{400} - \left(\frac{2548}{400}\right)^2} = 19\sqrt{43.2850 - 40.5769} = (19)(1.6456) = 31.2664$$

Sheppard's Correction:

$$s' \text{ (corrected)} = \sqrt{(s')^2 - \frac{C^2}{12}} = \sqrt{(31.2664)^2 - \frac{361}{12}} = \sqrt{977.5878 - 30.0833} = \sqrt{947.5045} = 30.7816$$

$$\text{C.V.} = \frac{100\,s'}{\bar{x}} = \frac{(30.7816)(100)}{152.03} = \frac{3078.16}{152.03} = 20.2471$$

The arbitrary origin in this case was taken at the first class. The calculation involves larger numbers than when taken near the center of the range, but all numbers are positive.

XII. General Applicability of Statistical Methods

Knowledge of the frequency distribution leads to an elementary insight into the statistical process. The methods of statistics must be applied with caution to experimental data.

(a) Mathematical Basis for Application

The methods of statistics comprise the application of the solutions effected by the calculus of probability to precisely stated mathematical problems in the attempt to answer questions connected with actual experiments.
For the methods of statistics to apply validly to the practical problems connected with experimental work it is necessary that a high degree of correspondence exist between the realities observed in phenomena and the abstract but very definite concepts upon which the mathematical solution of the problem is based. The one possible way to be certain of a correspondence is to carry out repeated random experiments. Statistical methods may be employed to answer questions and test hypotheses that concern phenomena observed in experimental work when this correspondence is satisfactory. The principal cause of the misapplication of the statistical method is the fact that it is often merely assumed that a correspondence exists between measurements and observations concerned with phenomena that result from experiments and the abstract concepts of the probability theory employed to produce the statistical method used in the interpretation of the experimental results. (See Neyman, 1937.)

(b) Value of the Statistical Method

There are many advantages attributed to the use of the statistical method. (1) It provides a sound basis for the formulation of experimental designs. Goulden (1937) makes this statement: "The experiment that has been correctly designed gives maximum efficiency, an unprejudiced estimate of the errors of the experiment, and yields results not only on the primary factors with which the experiment is concerned, but also on the important inter-relations of these factors." (2) It tends to eliminate the personal equation, i.e., it does away with differences in personal interpretation. (3) The statistical method is useful in the reduction and condensation of data. Fisher (1934) states that no human mind is able to grasp in its entirety the meaning of any considerable quantity of numerical data.
It allows one to express relevant information by means of comparatively few numerical values. (4) It affords a means to measure and evaluate chance errors. This is probably the outstanding contribution of statistics. (5) The statistical method affords one of the best measures of concomitant variations, i.e., correlation. (6) It gives a quantitative measure of variation, including chance variation. Statistics are widely used in genetics for this purpose.

(c) Reliability of the Statistical Constant

The reliability that can be placed on statistical constants depends, in many cases, on the type of data being analyzed. However, several factors contribute to reliability. (1) Reliability depends on the accuracy of the measurements. (2) Quantitative data are likely to be more accurately measured than qualitative data. (3) Samples collected at random are usually more reliable than those selected by other means, although samples by design in planned arrangements are very good. (4) A large sample is more likely to be representative than a small one. Arbitrarily, populations of less than 100 individuals or variates ordinarily are considered small samples to which special precautions should be applied (Fisher, 1934).

Conclusions drawn from many of the older field experiments are questionable because there were too many different kinds of treatments and too little replication or repetition. Statistical methods have done much in recent years to increase the reliability of field experiments. The difficulty of small samples has been alleviated in many instances by the calculation of a generalized standard error based on all the plots of the experiment. Harris (1930) claims that many agronomic experiments can be organized to "make possible the application of the powerful methods of biometric description and analysis."
(d) Some Misconceptions of the Statistical Method

There is little question about the value of the statistical method as such, but much question as to its application. The statistical method cannot correct poor technic or be applied indiscriminately. The standard error of a statistical constant fails to measure the accuracy of an experiment unless all errors (personal equation) have been eliminated except those due to chance. The statistical method may eliminate some systematic errors, but to no great extent. An effective way to eliminate systematic errors, or at least to discover them, is to repeat the experiment in a different manner. Statistics may lend support to a hypothesis but does not necessarily prove it.

Several years ago, arguments on the use of statistical methods in agricultural research were quite common. The mathematical foundations of the statistical formulae are now regarded as well established, but argument on the proper application of certain statistical measures will continue much as it does in experimental technic generally. Blind application of statistical procedures, as with any other technic, is harmful. Common sense and good judgment are vital in all phases of experimental work.

Salmon (1929) points out that statistical treatment in itself is seldom satisfactory because: (1) The observed result may not be due to the assigned cause. (2) The laws of chance are often an unsatisfactory basis for action or for specific advice. (3) Many experiments do not furnish results which readily lend themselves to statistical treatment because of bias, lack of randomness, or paucity of the observations. (4) Most experiments furnish evidence supplementary to the main issue which is of the greatest value for the arrival at a reasonable interpretation of the results.
This type of statement is answered by Goulden (1937) who "doubts very seriously the contention that all really worthwhile effects are obviously significant. At any rate this is at best a dangerous concept as evidenced from scores of examples in published papers where conclusions have been drawn that can be proved by the data to have very little foundation. Thus, the experimentalist who states that his results are so obvious that they do not require tests of significance is merely stating that in his experience with such experiments, differences as great as those obtained are very unlikely to have arisen by chance variation. We have no quarrel with this reasoning in that it is exactly the type of reasoning employed in tests of significance. Our contention is merely that a determination of probability based on a measure of variability furnished by the experiment itself is sound experimental logic and vastly superior to any method based on pure guesswork."

References

1. Fisher, R. A. Statistical Methods for Research Workers. Oliver and Boyd. (5th edition) pp. 1-7, and p. 49. 1934.
2. Goulden, C. H. Methods of Statistical Analysis. Burgess Publishing Co., pp. 1-8. 1937.
3. Harris, J. Arthur. Mathematics in the Service of Agronomy. Jour. Am. Soc. Agron., 20:[pages illegible]. 1928.
4. Harris, J. Arthur. Criticism of the Limitations of the Statistical Method. Jour. Am. Soc. Agron., 22:263-269. 1930.
5. Love, H. H. The Importance of the Probable Error Concept in the Interpretation of Experimental Results. Jour. Am. Soc. Agron., 15:217. 1923.
6. Neyman, J. Lectures and Conferences on Mathematical Statistics. Graduate School, U.S.D.A. 1937.
7. Salmon, S. C. Why We Believe. Jour. Am. Soc. Agron., 21:854-859. 1929.
8. Salmon, S. C. The Statistical Method: A Reply. Jour. Am. Soc. Agron., 22:270-271. 1930.
9. Tippett, L. H. C. The Methods of Statistics. Williams and Norgate. 2nd Ed. pp. 19-42. 1937.
10. Treloar, A. E. An Outline of Biometric Analysis. Burgess Publishing Co. pp. 4-20. 1935.
11. Weld, L. D.
Theory of Errors and Least Squares. Macmillan. pp. 1-30. 1916.
12. Yule, G. U. An Introduction to the Theory of Statistics. 9th Ed. pp. 211-213. 1929.

Questions for Discussion

1. Explain why there is no such thing as exact measurement in quantitative data.
2. Distinguish between errors and blunders in measurements.
3. Define statistics. What are some typical statistical constants?
4. In what branches of science have modern statistical methods been most extensively used? Why?
5. Define these terms: variable, variate, frequency, population, and sample.
6. What is the mean? How does it differ from the so-called weighted mean?
7. What is a frequency distribution or frequency table?
8. What is a class interval? How would you determine it for an array of data?
9. What is the difference between a histogram and a frequency polygon?
10. Give 3 measures of central tendency and distinguish between them.
11. What is a normal curve? Skew curve? Bimodal curve?
12. Distinguish between the binomial, normal, and Poisson distributions.
13. Define the standard deviation of the sample. Population.
14. What is the best estimate of the standard deviation of the population as obtained from the sample?
15. Prove that $\sqrt{\dfrac{Sd^2}{N}} = \sqrt{\dfrac{Sx^2}{N} - \left(\dfrac{Sx}{N}\right)^2}$
16. What is the coefficient of variability? When is it correctly used?
17. What is meant by an arbitrary origin? Where can it be taken? Why?
18. Explain Sheppard's Correction and the reason for its use.
19. What are some of the specific things that statistical methods are expected to do when properly applied to data?
20. What are some of the difficulties likely to be encountered in applying statistical methods to field experiments?
21. What factors contribute to the reliability of statistical constants?
22. Is the evidence afforded by statistical analysis of data negative or positive? Explain.
23. Why have statistical methods been only partially used in agronomy? Name 3 men who have advocated such methods in this field.
24.
What are the principal arguments of that school of opinion which favors (or insists on) the application of modern statistical methods to field experiments?
25. What are the principal arguments of those who do not favor the use of such methods?
26. What is generally indicated when "common sense" and interpretation based on statistical methods do not agree?

Problems

1. In determining the moisture content of corn by the Brown-Duvel moisture tester, the common practice is to base the moisture percentage on the total or wet weight (corn plus moisture) of the corn. The moisture content of hay, however, is often expressed as a percentage of the dry weight of the hay.
(a) A variety of corn produced 17.2 lbs. of shelled corn that contained 14.0 per cent moisture on a 12-hill plot. The hills were 3 x 3 feet apart. Calculate the yield of shelled corn in bushels per acre on a 15.5 per cent moisture basis.
(b) A twentieth-acre plot of hay produced 120 pounds of field-cured hay. Samples taken when the hay was weighed showed that it contained 20 per cent moisture. Express the true yield in tons per acre on a 15 per cent moisture basis.

2. Head counts were made on a number of fields in a township as follows:

FIELD NO.    NO. HEADS COUNTED    NO. HEADS SMUTTED
    1                50                    4
    2             1,000                    1
    3               100                    1
    4               500                   15
    5               400                   20
    6               300                    4
    7             1,000                   10
    8               600                   12
    9               200                    6
   10            10,000                   50

What percent smut may be expected in the wheat delivered to the elevator from this township?

3. These data were taken from several fields to determine the probable losses from smut for a community:

FIELD    PCT. SMUTTED (X)    SIZE OF FIELD (f)
(No.)        (Heads)              (Acres)
   1           1.0                  100
   2          15.0                   20
   3           0.5                  210
   4          20.0                   10
   5           0.0                  500
   6           0.5                  500
   7           3.0                   50
   8           2.5                  125
   9           0.1                  225
  10           5.0                  150

What percent smut may be expected?

4. Some Iowa data were collected to determine the relation of certain ear characters in corn. The yields from the very short ears, when used for seed, were as follows:

Year    No. Ears Used    Yield (Bu. per acre)
191[?]
            24               42.70
1918    [illegible]          26.33

Determine the average yield for the very short ears for the 3-year period.

5. The table that follows gives the heights of plants of buckwheat in a study of variation at Cornell University. Plot the frequency curve on cross-section paper.

Height in Centimeters    Number of Plants
         25                      2
         35                      2
         45                      3
         55                      5
         65                     10
         75                     12
         85                     60
         95                     99
        105                    144
        115                     85
        125                     65
        135                     18
        145                      2
        155                      1
      Total                    508

Does this seem to approximate a normal curve? What can be said as to the position of the mean in a normal curve? The mode? The median? Is a normal curve symmetrical? Is a symmetrical curve necessarily normal?

6. The table that follows gives the average yield of wheat per plant in certain studies at Cornell University. Plot the frequency curve as in the previous example.

YIELD PER PLANT    NUMBER PLANTS
    (grams)
      0.5               57
      1.5               59
      2.5               88
      3.5               41
      4.5               45
      5.5               29
      6.5               26
      7.5                5
      8.5                8
      9.5                6
     10.5                8
     11.5                5
     12.5                5
     13.5                1
     14.5                1
     15.5                2
     16.5                2
     17.5           [illegible]
    Total              366

In what respect does it differ from that of the previous example? What name is given to frequency curves of this kind? Do the mean, median, and mode coincide in this curve?

7. The number of stalks were measured on two different kinds of Colsess barley plants grown in 1930 at the Colorado Experiment Station. One kind was a normal green (AcAc) and the other heterozygous for a lethal factor (Acac). Plot the frequency curves. Does the lethal seem to be detrimental to growth?

No. Stalks     Heterozygous Plants    Green Plants
per Plant         (Frequency)          (Frequency)
    1                  5                    7
    2                 14                    9
    3                 51                   28
    4                 62                   33
    5                 63                   31
    6                 41                   19
    7                 21                   12
    8                 12                    4
    9                  5                    1
   10                  1
   11                  1
   12
   13                                       1
 Totals              276                  145

(Note: Calculate the frequencies of the green plants on a basis of N = 276 in order to make the two sets of data readily comparable.)

8. Some data were collected by Emerson (1913) for the study of size inheritance in corn. Classify the data for hybrid 60 x 54. Prepare a frequency table for these data and calculate the mean of the sample using a guess mean.
Continue and find the standard deviation (s') and the coefficient of variability. The measurements are given as lengths of ears in centimeters:

Hybrid 60 x 54

15 13 10 12 13 10 13 15 11 10 10 13 15 12 13 14 14 14 11 10 13 12 11 12 11 12 10 13 14 12 11 11 14 10 9 10 11 13 13 14 12 11 10 14 11 13 12 13 13 10 11 12 12 11 13 12 10 13 12 10 11 13 14 13 12 15 14 12 13

9. Calculate the standard deviations (s') for height of plants in problem 5, using (a) deviations from a true mean and (b) deviations from a guess mean.

10. Some 1930 data on black hulless barley plants were compiled by the Colorado Experiment Station to determine the variation in number of kernels per plant. The data are grouped in classes. (a) List the class boundaries, and calculate the mean, standard deviation, and coefficient of variability. (b) Apply Sheppard's correction to the standard deviation. (c) Is the number of group classes sufficient according to Yule's formula? Calculate.

x (class center)    15   45   75   105   135   165   195   225   255   285   315
f (frequency)        2   12   11    26    38    26    18    13    13     3     1    N = 163

Note that the origin is taken at the class center below 15.

CHAPTER VI

TESTS OF SIGNIFICANCE

I. Statistics as a Basis for Generalization

So far, the discussion has dealt with a sample and its statistical description. The investigator may desire to apply the information collected from the samples to describe the general population. Before he can do that, he must take into consideration the chance or random errors introduced in the actual taking of the sample. Chance errors result from the operation of a great many factors, none of which is dominant, and all of which are relatively similar, equal, and independent. When only chance errors operate, the data are said to be random and follow the law of great numbers.

Two kinds of error exist, chance and systematic. Errors due to chance may not be entirely eliminated but can be submitted to mathematical treatment.
Systematic errors can be largely eliminated when an experiment is properly planned.

II. Theory of Probability

In the analysis of chance errors, it is necessary to introduce some of the fundamental concepts of mathematical probability.

(a) Single Probabilities

The probability of the occurrence of an event can be defined from two viewpoints.

(1) Mathematical Probability: The mathematical or a priori probability of an event is the ratio of the number of ways the event may occur to the total number of ways it may either occur or fail to occur, assuming all such ways are equally likely. Thus, the probability of drawing any individual card from an ordinary deck is 1/52, while that of drawing any card of a given suit is 13/52 or 1/4. Probabilities are sometimes stated in terms of odds, e.g., suppose the probability of the occurrence of an event is 1/25. The odds are 1:24 in favor of its occurrence, or 24:1 against its occurrence. To be more explicit, the occurrence of the event is expected just once in 25 trials.

(2) Statistical Probability: Suppose an experiment is repeated a great number of times. When it terminates in a particular manner a certain number of times, the ratio of this latter number to the total number of trials defines an estimate of the probability of the particular termination. Suppose $N_1$ and $N$ represent the number of successes and the number of trials (both successes and failures), respectively; then $\lim_{N \to \infty} N_1/N$ will be defined as the probability of a success. Thus this probability can be approached but never attained in practical work with infinite populations. The permanency of the value $N_1/N$ for N large is the law of great numbers. This permanency results from randomness in the experimental trials and is the necessary property that statistical data must possess to admit valid treatment by mathematics. As an illustration
of statistical probability, in a frequency distribution, any particular class frequency divided by the total number of observations in the distribution gives an estimate of the probability that any individual observation made at random will fall in that particular class.

It is evident from either definition that the probability of the occurrence of an event may vary between zero (0), i.e., certainty that the event will not happen, and one (1), i.e., certainty that the event will happen.

(b) Several Probabilities

When several probabilities are to be dealt with simultaneously, it becomes necessary to consider two fundamental theorems.

(1) Theorem I: When a number of mutually exclusive events have certain probabilities of occurrence, the probability of occurrence of some one or other of these events is the sum of their individual probabilities. For example, the probability that an observation in the barley yield data (Chapter 5, pages 39 and 40) will fall in class x = 88 is 12/400, while the probability that one will fall in class x = 107 is 31/400. The probability that an observation will fall in either class 88 or 107 is 12/400 + 31/400 = 43/400, i.e., P = 0.11.

(2) Theorem II: When a number of independent events have certain probabilities of occurrence, the probability of all occurring together is the product of their individual probabilities. In the above example, the probability that the first and second observations will fall in classes x = 88 and x = 107, respectively, is 12/400 times 31/400 = 372/160,000, i.e., P = 0.0023.

A -- Large Sample Theory

III. Probability and the Normal Curve

Statistical data that possess the property of randomness often are distributed in a manner closely expressed by a normal distribution. Many of the sample statistics of large samples can be mathematically proved to have distributions extremely close to normal. Therefore, the application of probability to the normal curve is important in practical work.
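The two theorems of II (b) can be checked with exact fractions. The sketch below uses Python's standard `fractions` module; the helper names `prob_either` and `prob_all` are hypothetical and are introduced only for this illustration. The product rule assumes the two observations are drawn independently, as Theorem II requires.

```python
from fractions import Fraction

N = 400                      # number of plots in the barley sample
p_88 = Fraction(12, N)       # estimated probability of falling in class x = 88
p_107 = Fraction(31, N)      # estimated probability of falling in class x = 107

def prob_either(*probs):
    """Theorem I: probability that one or another of mutually exclusive events occurs."""
    return sum(probs, Fraction(0))

def prob_all(*probs):
    """Theorem II: probability that all of several independent events occur."""
    result = Fraction(1)
    for p in probs:
        result *= p
    return result

either = prob_either(p_88, p_107)
both = prob_all(p_88, p_107)
print(either, float(either))   # sum of the two class probabilities
print(both, float(both))       # product of the two class probabilities
```

Working in fractions keeps 43/400 and 372/160,000 exact; the decimal values 0.11 and 0.0023 quoted in the text are these fractions rounded.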
The area below the curve is taken as one unit. Hence, the area between any two ordinates may be considered as the probability that an individual observation will fall within the range defined by the two ordinates. Now, by Theorem I, the probability that an individual observation will fall within any range is the sum of the probabilities that it will fall in all sub-divisions of that range.

Thus, for characters which are distributed normally, it is possible to estimate the probabilities of their occurrence in any given range. This is done by finding the areas beneath the normal probability curve that correspond to the given range. Mathematical tables of such areas, called probability integral tables, have been constructed. (See Table I in appendix). Some of the most important probabilities and ranges are given below with the aid of a figure.

[Figure: normal curve with ordinates marked in units of σ on either side of the mean and at -P.E. and +P.E. (t = -0.6745 and +0.6745, respectively).]

In the case of a normally distributed variable, it is clear that the probability that an individual observation will fall within a range of σ on either side of the mean (x̄) is approximately 0.68; within a range of 2σ it is approximately 0.95; while within a range of 3σ it is 0.997. Thus, the probability that the observation will differ from x̄ by at least 2σ is 1.00 - 0.95 or about 0.05. In other words, the chances that an individual will fall outside a range of 2σ are approximately 5:95 or 1:19. This means that such a situation may be expected about once in 20 times due to chance alone. In like manner, the probability that the observation will differ from x̄ by at least 3σ is 1.000 - 0.997 or 0.003. Such a result, then, may be expected to happen only once in 333 times.

Therefore, when an observation differs from the mean by "too much" there arises the important question as to whether or not this abnormal result might not be due to some special cause acting in the case of this individual.
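The areas quoted for ranges of 1σ, 2σ and 3σ can be recomputed rather than read from a table. The following is a minimal sketch using the standard normal curve from Python's standard library; the function name `prob_within` is an assumption of this example.

```python
from statistics import NormalDist

def prob_within(k: float) -> float:
    """Area under the normal curve between -k and +k standard
    deviations from the mean (total area taken as one unit)."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    inside = prob_within(k)
    outside = 1.0 - inside
    print(f"within {k} sigma: {inside:.4f}   outside: {outside:.4f}   "
          f"about once in {1.0 / outside:.0f} trials")
```

Carried to four places the 3σ area is nearer 0.9973 than 0.997, so the "once in 333 times" of the text reflects rounding; the unrounded figure is closer to once in 370 times.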
When some special affecting condition is known to exist, common sense leads one to the conclusion that the extreme abnormality of the observation is more likely due to the affecting condition than to be expected on the basis of probability.

IV. Levels of Significance

What constitutes an abnormality which is "too much" is a matter of arbitrary decision. Common usage in this country considers an abnormality of twice the standard deviation (standard error in this sense) as being sufficient to warrant the statement that the abnormality or difference from the mean is a real or significant difference.¹ This does not mean that an individual observation taken at random and showing a significant difference does not belong to the general population. However, in such a case one would inquire as to whether the individual case in question was of a special nature, either inherently or by reason of treatment. Should such a condition be substantiated, it is quite proper to attribute the abnormality to special cause or condition and not to chance.

Some workers in the field of statistics use a difference of 3σ as a criterion for a significant difference. This allows the worker to place more confidence in a conclusion derived from a "significant" observation, but this advantage is overshadowed by a possible tremendous loss of information due to the imposition of a too stringent criterion.

V. Different Kinds of Probability Tables

There are two kinds of probability tables, viz., one-way and two-way tables. The use of a particular one depends upon the nature of the statistical hypothesis to be tested. The results obtained in one can be readily explained in terms of the other (See Livermore, 1934).

(a) One-Way Tables

The principal one-way table for normal curve areas is that devised by Sheppard and published by Karl Pearson (1914) as Table II.
Suppose an ordinate is erected at a distance on the positive side of the mean, exactly twice the standard deviation (σ). Thus t or d/σ = 2. From Table I (appendix), it is found that the area (A) that corresponds to t (or d/σ) = 2 is 0.9772, or the area defined by the interval from minus infinity to the assigned value of t (d/σ). Thus, with the total area beneath the curve considered as 1.0, the area to the left of the ordinate is 0.9772 while that to the right is 1.0000 - 0.9772 = 0.0228. Thus, P = 0.0228 (about 1/44) is the probability that a value taken at random will exceed the mean (in one direction only) by an amount equal to 2 or more times the standard deviation (σ).

¹This approximates 3 times the probable error.

Sometimes probabilities are expressed as odds: Area inside the ordinates divided by area outside the ordinates is equal to the odds against the occurrence of a deviation as great or greater than the designated one due to chance alone. In the above example,

0.9772/0.0228 = 43:1 (approximately)

In this case the odds are 43:1 that a value will not exceed the mean to the extent of two or more times the standard deviation due to chance alone. Table I (appendix) is a one-way table.

(b) Two-Way Tables

Suppose one inquires as to the probability of selecting a variate at random so that it shall fall outside the limits of plus or minus twice the standard deviation. Two ordinates are erected, one at t or d/σ = -2 and one at t = +2. The problem is to find the area in both tails of the curve. This will be (1.0000 - 0.9772) times 2 = 0.0456, or double that in the one-way table. This means that the probability that a single variate selected at random will deviate by an amount equal to or greater than ±2σ is 0.0456, or approximately 1/22. The values on a two-way basis can be expressed as odds as follows: 0.9544/0.0456 = 21:1.
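The one-way and two-way areas for t = 2, and the odds that go with them, can be recomputed from the normal curve itself. This is a sketch only; the helper names are assumptions of this example, and the curve is evaluated numerically instead of being read from Table I.

```python
from statistics import NormalDist

def one_way_p(t: float) -> float:
    """Area in one tail of the normal curve beyond +t (one-way basis)."""
    return 1.0 - NormalDist().cdf(t)

def two_way_p(t: float) -> float:
    """Area in both tails beyond -t and +t (two-way basis)."""
    return 2.0 * one_way_p(t)

def odds_against(p: float) -> float:
    """Odds against a deviation this large: area inside / area outside."""
    return (1.0 - p) / p

t = 2.0
p1 = one_way_p(t)
p2 = two_way_p(t)
print(f"one-way: P = {p1:.4f}, odds about {odds_against(p1):.0f}:1")
print(f"two-way: P = {p2:.4f}, odds about {odds_against(p2):.0f}:1")
```

Doubling the unrounded one-tail area gives a two-way probability slightly below the 0.0456 obtained by doubling the four-place table entry; the odds agree with the text to the nearest whole number.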
Thus, the odds are 21:1 against the occurrence of a deviation as great or greater than the designated one (plus or minus twice the standard deviation) due to chance alone. A typical two-way table for large samples is Table IV given by Davenport (1936).

In summary, it should be clear that the one-way interpretation or the use of a one-way table gives the probability or odds that an obtained value shows a certain discrepancy from the mean in a stated direction, whereas the two-way interpretation does not state the direction which the discrepancy must take in a statement of probability or odds.

(c) Transformation of Values

The probability values obtained in one type of table can be readily transformed into terms of the other to meet the experimental argument at hand. Probability values in a one-way table can be doubled to give the results obtained from a two-way table, and vice versa. The transformation of odds is as follows:

$\text{Odds in two-way table} = \frac{\text{odds in one-way table} - 1}{2}$

$\text{Odds in one-way table} = 2\,(\text{odds in two-way table}) + 1$

VI. Standard Errors of Statistical Constants

Each statistical constant or estimate has its own standard error. The standard error of a statistic derived from a sample is the standard deviation of the distribution of that statistic thought of as resulting from many samples. The distributions of many statistics are nearly normal, particularly when the basic sample is large.

(a) Standard Error of a Single Observation

The "best"¹ estimate of the standard deviation of a single observation (σ) is the standard error (s) derived from the sample.

¹Note: The best unbiased estimate is simply called "best". See more advanced treatments of mathematical statistics.

Some data on the total weight of
grain in grams for non-competitive Colsess barley plants are as follows:

Class Center   1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31  33
Frequency      3  11  21  35  4?  55  71  52  4?  35  21  10   9  11   5   1   1

N = 411     x̄ = 13.8     Sfd² = 14,776.64

$s = \text{standard error of a single observation} = \sqrt{\frac{S f d^2}{N - 1}}$   (1)

In the above example, it would be calculated as follows:

$s = \sqrt{\frac{14{,}776.64}{410}} = 6.003$ grams

This value, s = 6.003 grams, is the standard error of a single variate in this sample. For instance, the value of the mean, 13.8 ± 2(6.003), indicates that the odds are 21:1 that a single individual taken at random will not deviate from the mean by more than 2σ in either direction, where the normal distribution of the population affording the sample data is assumed.¹

(b) Standard Error of the Mean

Suppose a second sample were taken. One could hardly expect to get exactly the same result for the mean (x̄) as in the sample in question. Thus, the mean (x̄) obtained from a single sample is merely an estimate of the true mean (m) of the whole population. The latter is unknown and necessarily must remain so. In case it were possible and practical to take and analyze a greater number of samples, finding the mean (x̄) for each, one would expect the mean of all the sample means to be very close, indeed, to the mean of the population (m). Since this is not feasible, one can only ask how good an estimate of the population mean (m) is the mean (x̄) computed from a single sample. The answer to this question can only be given in terms of probabilities. It can be shown mathematically that the means computed from a large number of large samples are distributed nearly normally with standard deviation, σ_x̄, which is theoretically equal to the ratio of the standard deviation of the population to √N, where N is the number of observations that make up the sample.² However, the standard deviation of the population (σ) is unknown and in its stead its estimate (s) derived from the sample is used.
Therefore the standard deviation of the hypothetical distribution of means of a large number of samples will be estimated as follows:

$\sigma_{\bar{x}} = \text{standard error of the mean} = \frac{s}{\sqrt{N}}$   (2)

The greater the number of observations in the sample, the smaller will be the standard errors of the various statistical constants. Hence, the statistical constants derived from a large sample are more likely to represent the true constants of the general population than those derived from a small sample. When the sample is small, the argument is the same except that the distribution for x̄ deviates from normality and needs special interpretation.

¹It should be noted that $s = \sqrt{S f d^2 / (N - 1)}$, which best estimates the standard deviation of the population, closely approximates $s' = \sqrt{S f d^2 / N}$, which is the standard deviation of the sample and is the estimate of σ given by the maximum likelihood principle.
²σ_x̄ is sometimes expressed as "S.E. of the mean."

In the above example the standard error of the mean (σ_x̄) is:

$\sigma_{\bar{x}} = \frac{s}{\sqrt{N}} = \frac{6.003}{\sqrt{411}} = 0.296$ grams.

The mean is 13.8 ± 0.296 grams. Therefore, the odds are 21:1 that x̄ (13.8 grams) does not differ from the unknown true mean (m) of the general population by more than 2σ_x̄, or 2(0.296) = 0.592 grams.

(c) Standard Error of the Standard Deviation

Next, it is desired to discover how reliably the standard deviation of a single sample (s') estimates the unknown standard deviation of the population (σ). Mathematically, it has been found that the best estimate of the standard deviation of a hypothetical distribution of standard deviations derived from a great number of samples is as follows:

$\sigma_{\sigma} = \text{standard error of standard deviation} = \frac{s}{\sqrt{2N}}$ (approximately)   (3)

From the example used above,

$\sigma_{\sigma} = \frac{6.003}{\sqrt{2(411)}} = 0.209$ grams

Therefore, the odds are 21:1 that s = 6.003 grams does not differ from the unknown true standard deviation of the general population (σ) by more than 2σ_σ, or 2(0.209) = 0.418 grams.
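The arithmetic of formulas (2) and (3) for the barley example can be sketched as follows. The function names are assumptions of this example, and N = 411 is taken as the sample size implied by the worked figures (0.296 and 0.209 grams) and not as an independently verified count.

```python
import math

def se_mean(s: float, n: int) -> float:
    """Standard error of the mean, formula (2): s / sqrt(N)."""
    return s / math.sqrt(n)

def se_sd(s: float, n: int) -> float:
    """Approximate standard error of the standard deviation,
    formula (3): s / sqrt(2N)."""
    return s / math.sqrt(2 * n)

mean = 13.8        # sample mean in grams, as quoted in the text
s = 6.003          # standard error of a single observation, grams
n = 411            # assumed sample size (inferred from the worked example)

sem = se_mean(s, n)
sesd = se_sd(s, n)
print(f"standard error of the mean: {sem:.3f} grams")
print(f"standard error of the s.d.: {sesd:.3f} grams")
print(f"2-sigma limits for the mean: {mean - 2 * sem:.3f} to {mean + 2 * sem:.3f} grams")
```

Because both quantities shrink as 1/√N, quadrupling the number of plants would only halve either standard error.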
(d) Standard Error of the Coefficient of Variability

By use of the same argument, the standard error of the coefficient of variability is:

$\sigma_{C.V.} = \frac{C.V.}{\sqrt{2N}} \sqrt{1 + 2\left(\frac{C.V.}{100}\right)^2}$ when C.V. is greater than 10   (4)

$\sigma_{C.V.} = \frac{C.V.}{\sqrt{2N}}$ when C.V. is less than 10   (5)

In the example used,

$\sigma_{C.V.} = \frac{43.5}{\sqrt{2(411)}} \sqrt{1 + 2\left(\frac{43.5}{100}\right)^2} = 1.781$

A table has been worked out by Brown (1934) to shorten the computation necessary to secure the standard error of the coefficient of variability when C.V. is greater than 10.

(e) Standard Error of an Average of Averages

The standard error of an average of averages is given by the formula:

$\sigma_a = \frac{1}{N} \sqrt{\sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2 + \cdots + \sigma_{\bar{x}_N}^2}$   (6)

where N equals the number of separate means and σ_x̄₁, σ_x̄₂, etc., represent their separate standard errors.

(f) Standard Error of a Difference

Suppose two samples are measured with respect to a common character. From the data, let two similar statistical constants be compared, e.g., the two means or the two standard deviations. The question arises as to whether the two constants differ significantly. Its answer depends upon the standard error of the difference, which is as follows:¹

$\sigma_d = \sqrt{\sigma_1^2 + \sigma_2^2} = \sqrt{s_1^2 + s_2^2}$   (7)

where s₁ and s₂ are the standard errors of the two like statistical constants derived from the two samples. Where a significant difference results in the case of two samples drawn from the same population, it would indicate probable improper sampling technique leading to lack of randomness.² The principal use of this method lies in its test as to whether or not a factor known to exist in the case of one sample, and not in the other, is really a causal factor to which an abnormal difference can be attributed, e.g., the difference between two yields in a yield trial.

For example, suppose it is desired to determine the standard errors of the difference of the mean yields of Kanred and Turkey wheats, and also for Manchuria and Minnesota 445 barleys.
(1) Wheat Variety    Yield (Bu.)        (2) Barley Variety    Yield (Bu.)
    Kanred           25 ± 0.7               Manchuria          38.9 ± 0.9
    Turkey           24 ± 0.6               Minnesota 445      48.5 ± 1.2
    Difference        1 ± 0.92              Difference          9.6 ± 1.5

$\sigma_d = \sqrt{(0.7)^2 + (0.6)^2} = 0.92$          $\sigma_d = \sqrt{(0.9)^2 + (1.2)^2} = 1.5$

VII. Significant Differences

After a difference is obtained for two statistical constants, as in the above example, it is desirable to test this difference for statistical significance. An investigator may arbitrarily choose whatever level of significance he desires, but should state the level chosen. He must use care in attributing differences to causal factors when the differences approach the level of significance that he has chosen. To determine the significance of the difference between two statistical constants, their difference divided by the standard error of the difference (σ_d) is commonly employed. For example, in the case of Kanred and Turkey wheats cited above,

$t = \frac{d}{\sigma_d} = \frac{1}{0.92} = 1.09$

When the level of significance is taken as d/σ_d = 2, this difference is not significant. On the basis of probability, the odds are a little more than 2:1 that this difference is a real or significant difference. Hence, it may be ascribed to chance, or to put it in another way, one would not claim superiority for Kanred because the probability is too large that such a statement is incorrect.

In a comparison of Manchuria and Minnesota 445 barley,

$t = \frac{d}{\sigma_d} = \frac{9.6}{1.5} = 6.4$

¹When σ₁ = σ₂, σ_d = σ₁√2.
²Relation (7) holds strictly only where the two variable statistics are normally distributed and derived from uncorrelated data.

Since t = d/σ_d is far greater than 2, the difference in yield between the two varieties is said to be real and not due to chance. In this case the claim is made that Minnesota 445 is superior to Manchuria in yield ability. The probability of the incorrectness of this statement is insignificantly small. It would be a miracle if this claim were really incorrect.

VIII. Probable Errors
The quantity 0.6745σ, which gives the range that contains half the observations, is termed the probable error. For example, an average yield of 15.0 ± 1.5 bushels would mean that the chances are 50:50 that the true value of the average for an infinite population lies between 13.5 and 16.5 bushels. It also indicates that the chances are even that it may lie outside this range.

(a) Use of Probable Errors

Historically, the probable error was used before the standard error. It is still widely used in this country in the statistical treatment of biological data, but the tendency is to use the standard error and think in terms of it. It is generally felt that "the probable error is an unmitigated nuisance," and has nothing to recommend it except its previous usage.

(b) Formulae for Probable Errors

The probable error is approximately two-thirds of the standard error. It can be obtained when each standard error value is multiplied by 0.6745. These formulae may be briefly summarized:

(1) $P.E._{\text{single determination}} = \pm 0.6745\, s$   (8)

(2) $P.E._{\bar{x}} = \pm 0.6745 \frac{s'}{\sqrt{N - 1}}$ or $\pm 0.6745 \frac{s}{\sqrt{N}}$   (9)

(3) $P.E._{\sigma} = \pm 0.6745 \frac{s'}{\sqrt{2N}}$   (10)

(4) $P.E._{C.V.} = \pm 0.6745 \frac{C.V.}{\sqrt{2N}} \sqrt{1 + 2\left(\frac{C.V.}{100}\right)^2}$   (11)
    where C.V. is greater than 10.

(5) $P.E._{C.V.} = \pm 0.6745 \frac{C.V.}{\sqrt{2N}}$   (12)
    where C.V. is less than 10.

(c) Levels of Significance for Probable Errors

The level of significance for the probable error is commonly taken as D/P.E._d = 3. This is equivalent to odds of about 22:1. Some workers use 3.2 times the probable error, for which the odds are approximately 30:1. A table of odds for probable errors is given by Hayes and Garber (1927) in "Breeding Crop Plants."

(d) Relation of Standard Errors to Probable Errors

Based on the normal curve, the quartile lines, Q₁ and Q₃, give the probable error of a single variate, or Q = 0.6745σ.

The intervals                          The intervals
x̄ ± 1σ include 68.3 % of variates      x̄ ± 1 P.E. include 50.0 % of variates
x̄ ± 2σ    "    95.5 "      "           x̄ ± 2 P.E.    "    82.3 "      "
x̄ ± 3σ    "    99.7 "      "           x̄ ± 3 P.E.    "    95.7 "      "

B -- Special Case of Small Samples

IX. Use of Small Samples in Biological Research

The methods heretofore explained relate to the determination of significance based on the normal distribution for large samples, but it is not always possible to obtain large samples. This is often the case in agricultural or biological experiments.

When the investigator can be certain that the populations which afford small samples approximate the normal distribution in form, he may feel that the interpretation of the statistical analysis is valid. Therefore, the material that follows is given on the basis of small populations whose distributions approach that of the normal curve.

Statistical treatment of small samples from populations far from normal in distribution may probably be inadequate. Too often it may lead to incorrect conclusions. Statistical analysis of a single sample with less than 20 cases is hazardous. In samples of 20 to 100 cases, the near-normality of the underlying population should be known. This places a severe limitation on the use of small samples, but fortunately in agricultural and biological experiments, most of the populations with which the experimenter deals are near normal. The importance of the small sample, together with its statistical treatment, has been discussed by Fisher (1934).

X. Degrees of Freedom

The reliability of a statistic (estimate of a population parameter) will obviously depend upon the number of variates in the sample. This dependence is also affected by the number of restrictions placed on the aggregate observations in the determination of an estimate of a population parameter.
The total number of observations diminished by the number of restrictions which they in aggregate must submit to has been termed "degrees of freedom" by Fisher (1934). It has been stated that the best estimate of the variance of a population as derived from the sample is as follows:

$s^2 = \frac{S(x - \bar{x})^2}{N - 1}$   (13)

In this case, the number of individual observations (N) is diminished by one to give the degrees of freedom. The number of statistical constants of the sample which are directly used in the computation are subtracted. The mean or total fixes one value in the above formula, so that only N - 1 observations are free to vary. This is of little importance when a large sample is analyzed, but very important in small samples.

XI. Probability Determinations with Small Samples

The distribution of x̄/s is not sufficiently close to normal for small samples. The nature of the distribution of x̄/s' was found by "Student" in 1908. He prepared a series of tables based on the distribution of x̄/s' (where $s' = \sqrt{S(x - \bar{x})^2 / N}$) which he designated as "Z". He showed that the "Z" distribution, now more commonly called Student's distribution, was the same as the Pearson Type VII curve. More recently, he has prepared tables for the distribution of "t", which is designated as x̄/σ_x̄ or x̄√N/s by Fisher (1934).

For a given value of "t" that corresponds to a given number of degrees of freedom one can read the probability in a manner analogous to the way the tables of areas of the normal curve are used. The "t" table devised by Fisher (1934) is a two-way table. A probability of 0.05 is Fisher's 5 per cent point for which the odds are 19:1; a probability of 0.01 is the one per cent point for which the odds are 99:1. In addition several one-way tables are in use. These are as follows: (1) Student's "t" table, (2) Livermore's modification of Student's "t", (3) Student's "Z", and (4) Love's modification of "Z".

For example, suppose t = 4.604 for 4 degrees of freedom.
The probability as found in a one-way table is equal to 0.995. The calculated odds would be 199:1. They are calculated as follows: 1 - P = 1.000 - 0.995 = 0.005. 5/1000 = 1/200. P = 1/200 is equivalent to odds of 199:1.

XII. Significance of Means

When d is the difference between the mean of the sample and any value (m') assumed to be the mean (m) of the population, it has been stated that the difference, d = x̄ - m', is significant when d/σ_x̄ exceeds 2. When this occurs, the hypothesis (m = m') is rejected. This procedure holds when d/σ_x̄ is nearly normally distributed, as in large samples. As this distribution is not close to normal for small samples, the "t" table should be used in such cases. When the 5 per cent point is used as the level of significance, a value of t = d/σ_x̄ that corresponds to P = 0.05 is considered as significant.

In this test for the significance of the mean one determines the probability of drawing a sample with a mean equal to x̄ from a population whose true mean (m) is assumed to be some particular value (m').

XIII. Means of Two Independent Samples

One of the most important problems in statistics is to test the significance of a difference between two means, i.e., x̄₁ - x̄₂ = d. Previously, it has been stated that the standard error of the difference of the means of two samples is $\sigma_d = \sqrt{\sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2}$. Should there be any reason to suspect that the standard deviations of the two underlying populations are different, one should form d/σ_d with σ_d as given here.

(a) Samples with Different Numbers of Observations

When it can be assumed that the standard deviations of the populations are the same, or that the samples have been drawn from the same population, then the best estimate (s) of the population standard deviation (σ) is:

$s = \sqrt{\frac{S(x_1 - \bar{x}_1)^2 + S(x_2 - \bar{x}_2)^2}{(N_1 - 1) + (N_2 - 1)}}$   (14)

Here N₁ and N₂ are the numbers of observations in the two samples, while the denominator evidently denotes the degrees of freedom.
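Formula (14) can be sketched in a few lines. The function name and the two short samples below are hypothetical, chosen only so that the sums of squares are easy to follow by hand; they are not data from the text.

```python
import math

def pooled_sd(x1, x2):
    """Pooled estimate s of a common standard deviation, formula (14),
    together with its degrees of freedom (N1 - 1) + (N2 - 1)."""
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    ss1 = sum((v - m1) ** 2 for v in x1)   # S(x1 - mean1)^2
    ss2 = sum((v - m2) ** 2 for v in x2)   # S(x2 - mean2)^2
    df = (len(x1) - 1) + (len(x2) - 1)
    return math.sqrt((ss1 + ss2) / df), df

# Hypothetical samples with unequal numbers of observations.
sample_a = [10.0, 12.0, 14.0]           # sum of squares about the mean = 8
sample_b = [9.0, 11.0, 13.0, 15.0]      # sum of squares about the mean = 20

s, df = pooled_sd(sample_a, sample_b)
print(f"pooled s = {s:.4f} on {df} degrees of freedom")
```

Each sample contributes its own sum of squares about its own mean, so a difference between the two means does not inflate the pooled estimate.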
This method to determine s as an estimate of σ is particularly important in the case of small samples.

The "t" value, equivalent to d/σ_d, is calculated as follows:¹

$t = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N_1 N_2}{N_1 + N_2}}$   (15)

(b) Samples with Same Number of Observations

The above formulae are simplified when the number of observations is the same in each sample, i.e., N₁ = N₂. The standard error (single observation) is as follows:

$s = \sqrt{\frac{S(x_1 - \bar{x}_1)^2 + S(x_2 - \bar{x}_2)^2}{2(N - 1)}}$   (16)

The value of "t" is as follows:

$t = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N}{2}}$   (17)

Some data presented by Immer (1936) may be used to illustrate the computation. Single plots of Velvet and Glabron barley were grown side by side on 12 different farms. The yields in bushels per acre are given below:

Farm No.     Glabron (x₁)     Velvet (x₂)     Sum
    1            49               42            91
    2            47               47            94
    3            39               38            77
    4            37               32            69
    5            46               41            87
    6            52               41            93
    7            51               45            96
    8            57               56           113
    9            45               42            87
   10            45               39            84
   11            48               47            95
   12            64               39           103

            S(x₁) = 580       S(x₂) = 509      1089
            x̄₁ = 48.3333      x̄₂ = 42.4167     x̄ = 45.3750
            S(x₁²) = 28,620   S(x₂²) = 21,979  100,269
            (Sx₁)²/N = 28,033.31   (Sx₂)²/N = 21,590.10

¹This can be readily proved as follows:

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_d} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s^2}{N_1} + \frac{s^2}{N_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{N_2 s^2 + N_1 s^2}{N_1 N_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N_1 N_2}{N_1 + N_2}}$

where "s" is an estimate derived by pooling the two samples, based on the hypothesis that the two populations have a common standard deviation (σ).

The computations are as follows:

$s^2 = \frac{[S(x_1^2) - (Sx_1)^2/N_1] + [S(x_2^2) - (Sx_2)^2/N_2]}{2(N - 1)} = \frac{(28{,}620.00 - 28{,}033.31) + (21{,}979.00 - 21{,}590.10)}{22}$

$= \frac{586.69 + 388.90}{22} = \frac{975.59}{22} = 44.3450$

$s = \sqrt{44.3450} = 6.6592$

$t = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N}{2}} = \frac{48.33 - 42.42}{6.6592} \sqrt{\frac{12}{2}} = 2.1769$

The "t" table is entered for t = 2.1769 for 2(N - 1) = 22 degrees of freedom. P lies between 0.05 and 0.02. It may
be concluded that the odds are in excess of 19:1 that the difference between the mean yields of these two varieties is not due to chance.

XIV. Means of Paired Samples

In this case, the variables are paired, i.e., each value of x₁ is associated in some logical way with a corresponding value of x₂. As a result, there will be the same number of variates in the two samples. When there are N pairs there will be N - 1 degrees of freedom available for the comparison. This is widely known as Student's Pairing Method.

(a) Student's Pairing Method

This method is devised to compare two results on a probability basis. It is used primarily for small samples, it not being necessary to assume a normal population. Partial mathematical proof of the method was first published by Student (W. S. Gosset) in 1908. Differences between paired values are dealt with directly, with the result that the correlation between paired values is taken into account. The method was brought to the attention of American agronomists in 1923 by Love, et al. (1923, 1924).

The variance (s²) and "t" values are calculated as follows:

$s^2 = \text{variance} = \frac{S(d^2) - (Sd)^2/N}{N - 1}$   (18)

$t = \frac{\bar{d}}{s} \sqrt{N}$   (19)

Here "t" is used to test an obtained value, d̄, in accordance with the hypothesis that the mean of the population of differences is zero. A significant result would mean the rejection of the hypothesis and would warrant a statement that the mean of one of the basic populations exceeded that of the other.

(b) Method of Computation

The method of computation can be illustrated from the Glabron vs. Velvet barley yields mentioned above. The computation follows:

Farm No.     Glabron (x₁)     Velvet (x₂)     d (Velvet from Glabron)
    1            49               42                  7
    2            47               47                  0
    3            39               38                  1
    4            37               32                  5
    5            46               41                  5
    6            52               41                 11
    7            51               45                  6
    8            57               56                  1
    9            45               42                  3
   10            45               39                  6
   11            48               47                  1
   12            64               39                 25

Sum         S(x₁) = 580       S(x₂) = 509        S(d) = 71
Mean        x̄₁ = 48.3333      x̄₂ = 42.4167       d̄ = 5.9167
                                                 S(d²) = 929
                                                 (Sd)²/N = 420.0857

$s^2 = \frac{S(d^2) - (Sd)^2/N}{N - 1} = \frac{929.0000 - 420.0857}{11} = 46.2649$

$t = \frac{\bar{d}}{s} \sqrt{N} = 5.9167 \sqrt{\frac{12}{46.2649}} = \frac{5.9167}{\sqrt{3.8554}} = 3.0133$

The value of "t" is then looked up in the t-table (Fisher, 1934) for 11 degrees of freedom (N - 1, for the 12 paired values), where it is found that the observed value lies between P = 0.02 and P = 0.01.

The "Z" table devised by Student is sometimes used. He designed "Z" as the ratio of the mean difference to the standard deviation of the mean difference, i.e.,

$Z = \frac{\bar{x}}{s'}$ where $s' = \sqrt{\frac{S(x - \bar{x})^2}{N}}$

Student (1926) calls attention to the fact that the "Z" table should be entered with N - 1 degrees of freedom. As mentioned previously, his "Z" table is a one-way table. The Z-value can be transformed to "t" as follows:

$t = Z \sqrt{N - 1}$

(c) Application of the Pairing Method

The application of this method is highly desirable for making comparisons between pairs of varieties or treatments when the scope of the experiment is limited to a few pairs of observations. It is useful for simple tests such as treated vs. untreated, where only two or three things are being compared. In plot work, the method can only be used to remove soil heterogeneity where the plots are physically paired, i.e., adjacent.

References

1. Brown, Hubert M. Tables for Calculating the Standard Error and the Probable Error of the Coefficient of Variability. Jour. Am. Soc. Agron., 26:65-69. 1934.
2. Davenport, C. B., and Ekas, M. P. Statistical Methods in Biology, Medicine, and Psychology. John Wiley and Sons. pp. 35-40, and pp. 166-172. 1936.
3. Fisher, R. A. Statistical Methods for Research Workers (5th edition). Oliver and Boyd, pp. 112-125. 1934.
4. Goulden, C. H. Methods of Statistical Analysis. Burgess Publ. Co., pp. 9-11; and 20-26. 1937.
5. Hayes, H. K., and Garber, R. J. Breeding Crop Plants. McGraw-Hill, p. 42 & 86-92. 1927.
6. Immer, F. R. Manual of Applied Statistics. University of Minnesota. 1936.
7.
Livermore, J. R. The Interrelations of Various Probability Tables and a Modification of Student's Probability Table for the Argument "t". Jour. Am. Soc. Agron., 26:665-673. 1934.
8. Love, H. H. The Importance of the Probable Error Concept in the Interpretation of Experimental Results. Jour. Am. Soc. Agron., 15:217-225. 1923.
9. Love, H. H., and Brunson, A. M. Student's Method. Jour. Am. Soc. Agron., 16:60. 1924.
10. Love, H. H. A Modification of Student's Table for Use in Interpreting Experimental Results. Jour. Am. Soc. Agron., 16:68-73. 1924.
11. Pearson, Karl. Tables for Statisticians and Biometricians. Part I. Cambridge U. Press. pp. 2-8 (Table II). 1914.
12. Student. The Probable Error of a Mean. Biometrika 6:1-25. 1908.
13. Student. New Tables for Testing the Significance of Observations. Metron, 5:18-21. 1925.
14. Student. Mathematics and Agronomy. Jour. Am. Soc. Agron., 18:703-720. 1926.
15. Tippett, L. H. C. The Methods of Statistics. Williams and Norgate. (2nd edition), pp. 110-121. 1937.

Questions for Discussion

1. What is the basis for using statistics for generalization?
2. Distinguish between a priori and statistical probability.
3. Give two basic theorems where several probabilities are involved.
4. What is the geometrical significance of the standard error? Its significance in practical problems?
5. Why is a difference said to be statistically significant when it is two or more times the standard error?
6. Is it correct to say that standard error is a measure of experimental error? Explain.
7. What is the difference between a one-way and two-way table in the calculation of probability? Interpret probabilities calculated from each kind of table.
8. How do odds differ in one-way and two-way tables?
9. How can odds be transferred from a one-way to a two-way basis? Explain the difference in interpretation.
10. Explain the difference between the standard error of a single observation and the standard error of the mean. Give the formula for each.
11.
What is the formula for the standard error of an average of averages? Standard error of a difference?
12. What is the relation of the standard error to the probable error? Why do most statisticians prefer to use standard error?
13. Why are special methods used for small samples?
14. What is meant by "degrees of freedom"?
15. Who was "Student"? What were some of his contributions to statistics?
16. What is the meaning of Fisher's "t"?
17. What was Student's "Z"? How can it be transformed to "t"?
18. How is the standard error calculated for the means of two independent small samples drawn from populations with equal standard deviations?
19. What is Student's pairing method? How does it differ from other methods of calculating standard errors?
20. Under what conditions can Student's pairing method be used? What are some of its limitations?

PROBLEMS

1. (a) If the mean of a population is 21.65, and σ = 3.21, determine the probability that a variate taken at random will be greater than 28.55 or less than 14.75. (b) Determine d/σ for P = 0.01, 0.05, and 0.30.

2. Suppose the odds in a 1-way table are 87:1. Transform them to a 2-way basis.

3. In a wheat variety test, yields in bushels per acre were as follows:

Kanred: 54.6, 53.7, 68.0, 55.2, 58.5, 62.1, 56.7, 64.2, 57.5. x̄ = 53.3
Cheyenne: 66.3, 60.9, 64.3, 67.6, 63.8, 62.2, 63.4, 60.6, 67.2, 55.3. x̄ = 64.3

Calculate: (a) the standard error for a single plot (s), and the standard error of the mean (σ_x̄) for each variety; (b) the standard error of the difference between the two varieties (σ_d); and (c) determine whether or not the difference between the varieties is statistically significant. Assume the population standard deviations are different.

4. The yields of two varieties in bushels per acre are as follows for several replications: Variety A: 58, 40, 40, 42, 39, 35, 32, 26, 42, and 44. Variety B: 37, 37, 40, 40, 32, 30, and 31.
   Compute a pooled estimate of the standard error (s) of the two varieties, compute t, and determine whether or not the varieties differ significantly in yield by reference to Table II in the appendix.
5. Two varieties of small grain, Big Four and Great Northern, were grown each year in adjacent plots from 1912 to 1920. The yields are given below.

   Yields in Bushels per Acre
   Year    Great Northern    Big Four
   1912    71.0              54.7
   1913    73.9              60.6
   1914    48.9              45.1
   1915    78.9              71.0
   1916    43.5              40.9
   1917    4 ( . V           45.4
   1918    63.0              53.4
   1919    48.4              41.2
   1920    43.1              44.8

   Which variety yields higher? Is this difference significant as shown by: (a) means of two independent samples? (b) paired samples?
6. The grain yields in grams per plot for spring wheat irrigated at the tillering and jointing stages were as follows for 1921 to 1923 (incl.):

   Year    Plot    Tillering    Jointing
   1921    A       155          281
           B       232          202
           C       243          271
           D       257          265
   1922    A       459          366
           B       332          408
           C       340          396
           D       312          366
   1923    A       513          602
           B       501          635
           C       563          593
           D       346          539
   3-Year Average  360          400

   Determine whether or not irrigation at tillering results in a significantly higher yield than irrigation at the jointing stage. Consider the values paired.

CHAPTER VII

THE BINOMIAL DISTRIBUTION AND ITS APPLICATIONS

I. The Binomial Distribution

Suppose that "p" is the probability that an event will occur in one trial, and "q" the probability of failure of that event to occur. Then it can be shown by means of the two theorems on probability that the successive terms of the binomial expansion will give the respective probabilities that, in "N" trials, this event will occur exactly N, N - 1, N - 2, ..., or 0 times. The binomial expansion is as follows:

$$(p + q)^N = p^N + N p^{N-1} q + \frac{N(N-1)}{1 \cdot 2}\, p^{N-2} q^2 + \frac{N(N-1)(N-2)}{1 \cdot 2 \cdot 3}\, p^{N-3} q^3 + \cdots + q^N$$

where evidently p + q = 1. Then, the probability of exactly X occurrences in N trials is:

$$\frac{N!}{X!\,(N - X)!}\; p^X q^{N-X}$$

where N! = 1·2·3 ... N. This expansion is called the Bernoulli series or distribution.
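The general term of the expansion can be evaluated directly. The following is a minimal sketch in Python, not part of the original manual; the function name `binomial_probability` and the choice of a fair die thrown 4 times (p = 1/6, N = 4) are assumptions made only for illustration. Exact fractions are used so that the terms can be compared with hand computation.

```python
from fractions import Fraction
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x occurrences in n trials, each with probability p."""
    q = 1 - p
    # comb(n, x) is N! / (X! (N - X)!), the coefficient of the general term.
    return comb(n, x) * p**x * q ** (n - x)


# Illustrative case: a six on a fair die (p = 1/6) in N = 4 throws.
p = Fraction(1, 6)
terms = {x: binomial_probability(x, 4, p) for x in range(4, -1, -1)}

for x, probability in terms.items():
    print(f"{x} sixes: {probability}")

# The terms of the full expansion must add to (p + q)^N = 1.
print("sum of terms:", sum(terms.values()))
```

Passing a `Fraction` keeps every term exact; passing a float such as `1 / 6` gives the usual decimal approximations instead.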
When p = q, the binomial distribution is symmetrical. This distribution is similar to the normal distribution for large values of N, but it is unsuited for continuous variables because the distribution itself is discontinuous.

(a) Relation to Probability

Suppose a die is thrown 20 times. In this case, the 21 terms of the expansion $(1/6 + 5/6)^{20}$ will give the various probabilities of a particular face, say six, appearing 20, 19, 18, 17, ..., or 0 times.

Now suppose that the problem is more complicated. Let 4 dice be thrown 20 times and the sixes counted that appear on each throw. In any one throw the probabilities of getting 4, 3, 2, 1, or 0 sixes are given by the terms of $(1/6 + 5/6)^4$. To secure the most probable results of the experiment, multiply each of these probabilities by 20.

P(4) = 1/1296 times 20 = 0.015
P(3) = 20/1296 times 20 = 0.309
P(2) = 150/1296 times 20 = 2.315
P(1) = 500/1296 times 20 = 7.716
P(0) = 625/1296 times 20 = 9.645

Then, the most probable outcome of the experiment is: no sixes, 10 times; 1 six, 8 times; 2 sixes, 2 times; 3 sixes, 0 times; and 4 sixes, 0 times.

(b) Constants of the Binomial Distribution

The formulas for the more important constants of the binomial distribution are as follows:

Mean number of occurrences: $\bar{x} = Np$ (1)
Variance: $\sigma^2 = Npq$ (2)
Standard error: $\sigma = \sqrt{Npq}$ (3)
Probable error: $\text{P.E.} = \pm 0.6745\sqrt{Npq}$ (4)
Mean proportion of occurrence: $p = \dfrac{N_1 p_1 + N_2 p_2}{N_1 + N_2}$ (5)
Variance of proportion of occurrences: $\sigma^2 = \dfrac{pq}{N}$ (6)
Standard error of proportion of occurrences: $\sigma = \sqrt{\dfrac{pq}{N}}$ (7)

II. Applications of the Binomial Distribution

The binomial distribution may have a variety of uses in comparisons of observed data with an a priori hypothesis or in the comparison of two samples.

(a) Comparison of Observations against an a Priori Hypothesis
Suppose in a sample of N trials of an experiment the number of occurrences of a given phenomenon is x. Let it be desired to test this result against an accepted standard outcome of such experiments, the expected proportion of occurrences being p. The expected number of occurrences is $\bar{x} = Np$, and the discrepancy will be the numerical value of x - Np = d. Then the probability that corresponds to $t = d/\sigma = d/\sqrt{Npq}$ may be found with the aid of the t-table when N is small, or with the table of normal curve areas when N is large. It would be equivalent to test the proportion (x/N) against the expected proportion (p) by the formation of

$$t = \frac{d}{\sigma} = \frac{(x/N) - p}{\sqrt{pq/N}}$$

A very common application is the comparison of observed data for monohybrid Mendelian ratios with the theoretical. (See III below.)

(b) Comparisons of Samples from Different Populations

It may be desirable to compare the proportion of occurrences in two samples from admittedly different populations. Then the samples provide the following information:

Sample I                                                               Sample II
$N_1$                          Number of cases or trials               $N_2$
$x_1$                          Number of occurrences of a given        $x_2$
                               phenomenon
$p_1 = x_1/N_1$                Proportion of occurrences               $p_2 = x_2/N_2$
$\sigma_1^2 = p_1 q_1/N_1$     Variance of the proportion              $\sigma_2^2 = p_2 q_2/N_2$

The difference in proportions is $d = p_1 - p_2$. The standard error of the difference is

$$\sigma_d = \sqrt{\sigma_1^2 + \sigma_2^2} = \sqrt{\frac{p_1 q_1}{N_1} + \frac{p_2 q_2}{N_2}}$$

Thus,

$$t = \frac{d}{\sigma_d} = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{N_1} + \dfrac{p_2 q_2}{N_2}}}$$

(c) Comparison of Samples from Same Populations

The difference between this problem and that in (b) above consists in the hypothesis that but a single population is being considered. As a result, the data afforded by both samples are combined to give estimates of p and σ for the population. Thus, the estimated proportion of occurrences will be

$$p = \frac{N_1 p_1 + N_2 p_2}{N_1 + N_2}$$

and the estimated standard error will be

$$s = \sqrt{\frac{pq}{N_1} + \frac{pq}{N_2}}, \quad \text{where } q = 1 - p.$$

Then $t = (p_1 - p_2)/s$ may be interpreted as in previous cases.

III. Standard Errors of Mendelian Ratios

In the analysis of genetic data, it is necessary to test the significance of the departure of the observed counts from the calculated counts obtained when certain theoretical conditions are postulated. With monohybrid ratios, the general practice is to use the binomial distribution, which is sometimes referred to as the probable error of a proportion. The ratios which may be calculated in this work by the binomial distribution are: 1:1, 3:1, 9:7, 13:3, 15:1, 63:1, and 27:37.

(a) Formula for Mendelian Ratios

The standard error of a Mendelian ratio is:

$$\sigma = \sqrt{p(1 - p)N} \qquad (8)$$

where N = the number of individuals, p = the proportion of one group as a decimal fraction, and 1 - p = the proportion of the other group as a decimal fraction (1 - p = q). Some writers use the formula S.E. = $\sqrt{p \cdot q \cdot N}$, where "p" and "q" represent the proportions in decimal fractions.

(b) Use of Method

The binomial method can be used in genetic data only when two phenotypic classes are grouped, other methods being used for three or more classes. In the F2 generation of a barley cross, 200 green and 72 white seedlings were counted. It is desired to test these data for a 3:1 ratio.

                        Green    White    Total
Observed numbers        200      72       272
Calculated 3:1 ratio    204      68       272
Deviation               4        4

To obtain the calculated number for a 3:1 ratio, divide the total number observed by the combined possible number of classes, which is 4 in this case, e.g., 272/4 = 68. This gives the calculated value for the white (or 1) class. For the green (or 3) class, multiply 68 by 3.
This gives 204.

$$\sigma = \sqrt{p(1 - p)N} = \sqrt{0.75 \times 0.25 \times 272} = 7.1414$$

Next, the deviation divided by the standard error is computed: d/σ = 4/7.1414 = 0.56.

The observed ratio fits the calculated 3:1 ratio very well, indicating that a simple Mendelian factor pair is responsible for the production of green and white seedlings. It is to be noted that d/σ is less than 2, which indicates that the fluctuation of the observed ratio from the calculated may be considered as due to chance. In any event, there is no reason to reject the theoretical ratio hypothesis.

Another example may be given for green and white barley seedlings.

                        Green    White    Total    d       σ         d/σ
Observed                183      161      344
Calculated 3:1 ratio    258      86       344      75      8.0312    9.34
Calculated 9:7 ratio    193.5    150.5    344      10.5    9.2009    1.14

It is apparent that the data do not fit a 3:1 ratio, as shown by the high value of d/σ. However, they fit a 9:7 ratio very well, indicating that there are two factor pairs involved in the production of green vs. white seedlings in this cross.

(c) Short-Cut Tables for Computations

Tables published by Cornell University give the probable errors for values of N from 11 to 1000. Another set of tables occurs in "Mendelian Inheritance in Wheat and Barley Crosses," by Kezer and Boyack (1918). The probable error values obtained from such tables can be converted to standard errors by dividing the probable error value by the factor 0.6745.

IV. The Poisson Distribution as a Special Case

As a rather special case of the binomial distribution, there is an approximation known as the Poisson distribution. This occurs when p, the probability of the occurrence of an event, is very small and N, the number of trials, is very large, so that Np becomes appreciable. In a Poisson distribution the probability, $P_x$, of exactly x occurrences in N (N = very large) trials is given by:

$$P_x = \frac{e^{-Np}\,(Np)^x}{x!}$$

where e is a constant (2.718), p is the probability of occurrence in a single trial, and x! = 1·2·3 ... x.
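Two computations from this chapter, the d/σ test of a two-class Mendelian ratio (III above) and the Poisson probability just defined, can be sketched in Python as follows. This is an illustration only and not part of the original manual; the function names `mendelian_deviation` and `poisson_probability` are assumptions, and the Poisson mean of 10 is an arbitrary value chosen for the demonstration.

```python
from math import exp, factorial, sqrt


def mendelian_deviation(observed_a, observed_b, ratio_a, ratio_b):
    """Return (d, sigma, d/sigma) for testing two observed classes against ratio_a:ratio_b."""
    n = observed_a + observed_b
    p = ratio_a / (ratio_a + ratio_b)      # expected proportion of the first class
    sigma = sqrt(p * (1 - p) * n)          # standard error of the ratio, formula (8)
    d = abs(observed_a - n * p)            # deviation of observed from calculated
    return d, sigma, d / sigma


def poisson_probability(x, mean):
    """Probability of exactly x occurrences when the mean number, Np, equals `mean`."""
    return exp(-mean) * mean**x / factorial(x)


# The two barley seedling examples of III (b): green vs. white counts.
for counts, ratio in (((200, 72), (3, 1)), ((183, 161), (3, 1)), ((183, 161), (9, 7))):
    d, sigma, t = mendelian_deviation(*counts, *ratio)
    print(counts, ratio, round(d, 1), round(sigma, 4), round(t, 2))

# Poisson terms for an assumed mean of Np = 10.
for x in (5, 10, 15):
    print(x, round(poisson_probability(x, 10), 4))
```

A value of d/σ below 2 is read, as in the text, as a deviation attributable to chance.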
Although there are published tables of these probabilities, their use is unnecessary in the more common types of application.

(a) Constants of the Poisson Distribution

For the Poisson distribution, the mean and variance are equal.

Mean = $\bar{x}$ = Np
Variance = $\sigma^2$ = Np, so that $\sigma = \sqrt{Np}$

(b) Use of Poisson Distribution

The Poisson distribution gives a basis for the solution of many problems that involve the maintenance of certain standards. Suppose that registered seed regulations state that red clover seed must not contain over a given percentage of noxious weed seeds in order to gain certification. Suppose that from a lot of seed, a sample is taken of such size that a count of 10 noxious weed seeds corresponds to the allowable percentage. In this case, the mean $\bar{x}$ = Np = 10. The standard error, $\sigma = \sqrt{Np} = \sqrt{10} = 3.2$. However, 18 weeds may have occurred in the sample analyzed. The whole lot is rejected for registration because the deviation from the mean, 18 - 10 = 8, exceeds twice the standard error, i.e., 2σ = 6.3.

Suppose that 16 noxious weed seeds are counted in a sample from another lot. Now a decision becomes questionable. Suppose that a second sample is taken and 14 weed seeds counted. Now consider the two samples as one. The mean, $\bar{x}$ = Np = 20, and the standard error, $\sigma = \sqrt{Np} = \sqrt{20} = 4.5$. Thus, 14 + 16 = 30, which differs from the mean by 10. This lot would therefore be rejected because the deviation from the mean, 10, exceeds 2σ = 2(4.5) = 9.0.

References

1. Anonymous. Tables of Probable Error of Mendelian Ratios. Department of Plant Breeding, Cornell University (mimeographed).
2. Fisher, R. A. Statistical Methods for Research Workers (5th edition), pp. 55-72. 1934.
3. Kezer, Alvin, and Boyack, B. Mendelian Inheritance in Wheat and Barley Crosses. Colorado Exp. Sta. Bul. 249. 1918.
4. Miles, S. R. A Very Rapid and Easy Method of Testing the Reliability of an Average and a Discussion of the Normal and Binomial Methods.
5. Robertson, D. W. The Effect of a Lethal in the Heterozygous Condition on Barley Development. Colorado Exp. Sta. Tech. Bul. 1. 1932.
6. Sinnott, E. W., and Dunn, L. C. Principles of Genetics. McGraw-Hill, pp. 371-375. 1932.
7. Tippett, L. H. C. The Methods of Statistics. Williams and Norgate, pp. 30-33. 1931.

Questions for Discussion

1. Give the binomial expansion of (p + q).
2. What type of distribution is the binomial distribution? What are its limitations?
3. What is the genetic application of the binomial distribution? Its limitations?
4. How does the Poisson distribution differ from the binomial distribution?
5. Under what conditions might the Poisson distribution prove useful?

Problems

1. Colsess, a white-glumed barley, was crossed with Nigrinudum, a black-glumed barley. The segregation in the F2 was 785 black-glumed plants and 215 white-glumed plants. What ratio best fits these data? Calculate d/σ.
2. The F2 segregation of a Colsess (hooded) by Minnesota 90-8 (awned) cross gave 229 hooded plants and 89 awned plants. Determine the ratio that best fits these data, and test its fit.
3. In a cross between Colsess II and Colsess III, 183 green seedlings and 161 white seedlings were observed in the F2. Determine the ratio that best fits these data and test the fit.

CHAPTER VIII

THE χ² TESTS FOR GOODNESS OF FIT AND FOR INDEPENDENCE

I. The χ² Test

So far, statistics like the sample mean (x̄) and the standard deviation (s) have been used to express differences between distributions, either an observed against a hypothetical distribution, or one observed distribution against another. However, in such cases the general form of the distribution (normal, binomial, Poisson) has been assumed and comparisons have been limited to values of parameters of the distribution. The use of moments such as these might be adequate for an accurate comparison of distributions were a sufficient number of higher moments employed.
However, this method has the principal disadvantage of being tedious, and it also raises questions as to the validity of the sampling errors of higher moments.

Many times it is desired to compare or test observed data with those expected on the basis of some hypothesis. This has been referred to as a test for "goodness of fit." Again, individuals may be measured or classified categorically with respect to two separate characters or conditions. It may be desired to test these characters for association. Both of these general problems can be attacked by use of a statistic known as χ² (Chi-squared) calculated from the data afforded by the sample.

II. The χ² Distribution

The χ² test, to measure "goodness of fit" of observed results to those expected, was advanced by Karl Pearson in 1900.

(a) Formula for χ²

The theoretical distribution must be adjusted to give the same total frequency as the observed. Then, when O is the number observed in any one group or category of the experimental distribution, and C the theoretically calculated number for the same group, based on the hypothesis that the data follow some certain distribution, the formula for χ² is as follows:

$$\chi^2 = S\left[\frac{(O - C)^2}{C}\right] \qquad (1)$$

where "S" is the summation extended over all the groups or classes. It is obvious that the more closely the observed number agrees with the calculated, the smaller χ² will be. Further, all differences in frequency (O - C) are squared, whether positive or negative. Thus, χ² is always a positive quantity, its size being clearly dependent on the number of groups into which the distribution is separated and on the degree of agreement between the several values of "O" and the corresponding values of "C". Therefore, in the ordinary application of χ², the number of degrees of freedom will be the number of groups diminished by the number of restrictions imposed on the theoretical distribution that supplies the values of C.
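Formula (1) can be sketched in a few lines of Python. This is an illustration only and not part of the original manual; the function names and the counts 290 and 110 are hypothetical, chosen to be tested against a 3:1 ratio.

```python
def calculated_from_ratio(observed, ratio):
    """Distribute the observed total over the classes in proportion to the ratio."""
    total = sum(observed)
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_square(observed, calculated):
    """Sum of (O - C)^2 / C over all classes, as in formula (1)."""
    if len(observed) != len(calculated):
        raise ValueError("observed and calculated must have the same number of classes")
    return sum((o - c) ** 2 / c for o, c in zip(observed, calculated))


# Hypothetical counts for two classes, tested against a 3:1 ratio.
observed = [290, 110]
calculated = calculated_from_ratio(observed, [3, 1])   # [300.0, 100.0]
value = chi_square(observed, calculated)
degrees_of_freedom = len(observed) - 1                 # only the totals are made equal

print(calculated, round(value, 4), degrees_of_freedom)
```

Because the calculated numbers are derived from the observed total, the two totals agree automatically, which is the single restriction counted in the degrees of freedom here.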
When the only restriction imposed is that the total frequencies of the observed and theoretical distributions shall be equal, the degrees of freedom are one less than the number of groups. In other words, where the frequencies are determined for all groups but one, the frequency of that one is automatically determined by subtraction from the total.

(b) Sampling Distribution

The sampling distribution of χ² has been worked out so that it is possible to find the probability (P) that a sample drawn from a hypothetical population with a given distribution will depart from that distribution enough to give a χ² value as large as or larger than that exhibited by the sample in hand. For any number of degrees of freedom, P = 1.00 for χ² = 0 and, as χ² increases, P diminishes. Since the mathematical relationships between χ² and P are complex, it is necessary to have tables that relate P, χ², and the number of degrees of freedom for practical use.

(c) Grouping Data

It is unwise to group too finely or to apply this test where the data are so insufficient that, for certain of the groups, the expected frequency is small. This condition very easily might cause that part of χ² contributed by such groups to unduly affect the total χ². This is obvious from the mathematical form, (O - C)²/C, where C is small. Fisher (1934) recommends that each group should contain at least five individuals for the test to apply. Sometimes the tail groups with very low frequencies should be combined.

III. Probability Tables for χ²

As has been stated, the probabilities for χ² values are obtained from tables. In order to use them, it is first necessary to know "n", the number of degrees of freedom in which the observed series may differ from the hypothetical. It is equal to the number of classes, the frequencies in which may be filled arbitrarily.
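Where a printed table is not at hand, P can also be computed directly. The sketch below is not part of the original manual; it assumes a whole number of degrees of freedom and uses the standard closed-form series for the χ² tail probability (a finite exponential series for even n, and a normal-integral term plus a finite series for odd n). The function name `chi_square_probability` is hypothetical.

```python
from math import erfc, exp, pi, sqrt


def chi_square_probability(chi2, n):
    """P of a chi-square value as large or larger, for n = 1, 2, 3, ... degrees of freedom."""
    if chi2 < 0 or n < 1:
        raise ValueError("chi2 must be non-negative and n a positive integer")
    if chi2 == 0:
        return 1.0
    half = chi2 / 2.0
    if n % 2 == 0:
        # Even n: exp(-chi2/2) * sum of (chi2/2)^k / k! for k = 0 .. n/2 - 1.
        term, total = 1.0, 1.0
        for k in range(1, n // 2):
            term *= half / k
            total += term
        return exp(-half) * total
    # Odd n: two-tailed normal probability of chi, plus a finite correction series
    # whose r-th term is chi^(2r-1) / (1*3*5*...*(2r-1)).
    chi = sqrt(chi2)
    term, total = chi, 0.0
    for r in range(1, (n - 1) // 2 + 1):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return erfc(chi / sqrt(2.0)) + 2.0 * exp(-half) / sqrt(2.0 * pi) * total


for chi2, n in ((4.0, 3), (5.0, 3), (3.2, 1), (0.0, 5)):
    print(chi2, n, round(chi_square_probability(chi2, n), 6))
```

Values computed this way are exact for the stated χ² and may differ slightly from probabilities obtained by linear interpolation in a table.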
When only the totals have been made equal, n = n' - 1, where n' is the total number of classes or groups. In contingency tables, where tests for independence are being made, the number of degrees of freedom is the product of the number of rows minus one and the number of columns minus one, (r - 1)(c - 1), because the hypothetical and observed classifications are forced to conform both for row and column totals. To quote Tippett (1931): "Suppose, in an extreme case, there are n' groups and we fitted a curve involving n' constants which were calculated from the data; then the two distributions would agree exactly and χ² would be zero because sampling errors would have had no play." The importance of degrees of freedom in looking up the probabilities that correspond to χ² has been emphasized by Fisher (1922, 1923, 1934).

(a) Elderton "Table of Goodness of Fit"

A table was prepared by Elderton with the values of "P" (probability) that a deviation as great as or greater than the observed may be expected on the basis of random sampling. These values correspond to each integral value of χ² from 1 to 30. This table is available in "Tables for Statisticians and Biometricians" by Karl Pearson (1914). The user must be careful with this table because n' is equal to the number of degrees of freedom (n) plus one. The probability of intermediate χ² values can be obtained approximately by interpolation.*

*Note: For example, the probability for χ² = 4.12 determined from 4 classes can be interpolated as follows:

When χ² = 4, P = 0.261464
     χ² = 5, P = 0.171797
Difference = 0.089667
Product = 0.12 x 0.089667 = 0.010760
"P" value = 0.261464 - 0.010760 = 0.250704

(b) Fisher "Table of χ²"

More recently, Fisher (1934) has published a table of χ² which uses degrees of freedom (n) directly. It gives values of χ² that correspond to special values of "P". Fisher (1934) states: "In preparing this table we have borne in mind that, in practice, we do not want to know the exact value of 'P' for any observed χ², but in the first place, whether or not the observed value is open to suspicion. If 'P' is between 0.1 and 0.9 there is certainly no reason to suspect the hypothesis tested. If it is below 0.02 it is strongly indicated that the hypothesis fails to account for the whole of the facts. We shall not often go astray if we draw a conventional line at 0.05 and consider that higher values of χ² indicate a real discrepancy."

The table given by Fisher has values of "n" up to 30. Beyond this point it will be found sufficient to assume that $\sqrt{2\chi^2} - \sqrt{2n - 1}$ is distributed normally with unit standard deviation about zero. For example: χ² = 35.62, n = 32, $\sqrt{2\chi^2}$ = 8.44, $\sqrt{2n - 1}$ = 7.94, difference = 0.50. Thus, where $\sqrt{2\chi^2} - \sqrt{2n - 1}$ is materially greater than 2, the value of χ² is not in accordance with expectation.

(c) Normal Probability Integral Table

In the special case of one degree of freedom (n = 1), the probability can be obtained from the table of the normal probability integral because χ is normally distributed for one degree of freedom. (See Table II, "Tables for Statisticians and Biometricians.") For example, suppose it is desired to find the probability that corresponds to χ² = 3.200.

$$\chi = \sqrt{\chi^2} = \sqrt{3.200} = 1.7889$$

In the table opposite t = 1.7889, the value of the probability that corresponds to it is found to be 0.9632. The value of the probability for the one tail will then be 1.0000 - 0.9632 = 0.0368. On the basis of a 2-tailed table it would be 0.0368 x 2 = 0.0736.

A -- Goodness of Fit

IV. Uses of χ² for Goodness of Fit

The χ² test for "goodness of fit" can be applied to data grouped into classes where it is desired to compare them with a theoretical or hypothetical ratio. The great advantage of this test for goodness of fit is that no limitations or conditions are imposed upon the form of the distribution under investigation.
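The three table short-cuts described in III above (interpolation between tabled values, the normal approximation beyond n = 30, and the one-degree-of-freedom case) can be checked numerically. The following Python sketch is an illustration, not part of the original manual; the function names are assumptions, and the figures passed to them are those quoted in the text.

```python
from math import erfc, sqrt


def interpolate_probability(chi2, lower, upper):
    """Linear interpolation of P between two tabled (chi2, P) pairs."""
    (x0, p0), (x1, p1) = lower, upper
    return p0 + (chi2 - x0) * (p1 - p0) / (x1 - x0)


def large_n_deviate(chi2, n):
    """sqrt(2*chi2) - sqrt(2n - 1), treated as a unit normal deviate when n exceeds 30."""
    return sqrt(2.0 * chi2) - sqrt(2.0 * n - 1.0)


def one_degree_probability(chi2):
    """Two-tailed normal probability of chi = sqrt(chi2), valid for one degree of freedom."""
    return erfc(sqrt(chi2) / sqrt(2.0))


# Interpolation for chi2 = 4.12 between the tabled values at 4 and 5 (4 classes).
print(round(interpolate_probability(4.12, (4, 0.261464), (5, 0.171797)), 6))

# Normal approximation for chi2 = 35.62 with n = 32.
print(round(large_n_deviate(35.62, 32), 2))

# One degree of freedom, chi2 = 3.200.
print(round(one_degree_probability(3.2), 4))
```

Linear interpolation slightly overstates P because the tabled curve is concave upward between entries; the difference is usually immaterial for the decisions discussed here.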
Historically, the χ² test was first used to test the goodness of fit of an observed frequency distribution to a normal distribution of the same total frequency, the same mean, and the same standard deviation. It is still used effectively for this purpose when the number in the sample is large. One sacrifices a fit in the tails of the distribution by use of the χ² test, but often the investigator is only interested in the central range which the data cover. The χ² test is particularly useful in genetics to test F2 and later segregations where two or more phenotypic classes are involved. J. Arthur Harris (1912) first called attention to the value of the χ² test for genetic data.

V. Computation of χ² for Goodness of Fit

In Mendelian ratios from F2 progenies and later generations, the common practice is to summate the numbers in each phenotypic class and to formulate a hypothesis on the basis of the ratio obtained in order to establish the number of genetic factors involved. The χ² test is used to determine whether or not the deviations of the observed numbers from the calculated numbers can be attributed to chance.

(a) General Method of Computation

In a cross that involves two independently inherited Mendelian factor pairs, a 9:3:3:1 ratio is expected in the F2 generation. A segregation in the F2 generation of a barley cross that involved long vs. short-haired rachilla (Ss) and covered vs. naked seeds (Nn) gave results as follows (data from Robertson):

Long-Haired Rachilla               Short-Haired Rachilla              Total
Covered Seeds    Naked Seeds       Covered Seeds    Naked Seeds
2061             645               675              256               3637
(SN)             (Sn)              (sN)             (sn)

The calculated numbers for a 9:3:3:1 ratio are computed so that the total of the theoretical values equals the total in the sample, i.e., 3637. The value 3637 is divided by 16 (9 + 3 + 3 + 1) to give the expected number in the class with short-haired rachillas and naked seeds, i.e., 3637/16 = 227.3125.
The values on the basis of expectancy for the 3-classes can be computed by multiplying 227.3125 by 3 = 681.9375, etc. The results can be put down as follows:

Classes    Ratio    Observed    Calculated    O - C      (O - C)²     (O - C)²/C
                    No. (O)     No. (C)
SN         9        2061        2045.81       15.19      230.7361     0.1128
Sn         3        645         681.94        -36.94     1364.5636    2.0010
sN         3        675         681.94        -6.94      48.1636      0.0706
sn         1        256         227.31        28.69      823.1161     3.6211
Totals              3637        3637.00                               χ² = 5.8055

n = 3
P = 0.1233

Hence, the deviations from the calculated ratio cannot be regarded as significant.

(b) Method for Two Classes

The χ² value may be calculated directly where "A" is the number in one class, "a" the number in the other, and "N" is the total number in the sample (A + a). These formulae are given by Immer (1936) and represent a transformation from the standard method for the computation of χ² for goodness of fit.

Ratio A : a    χ² Value
1 : 1          $(A - a)^2 / N$            (2)
3 : 1          $(A - 3a)^2 / 3N$          (3)
9 : 7          $(7A - 9a)^2 / 63N$        (4)
m : n          $(nA - ma)^2 / mnN$        (5)

The computation may be illustrated with data which appear to fit a 3 : 1 ratio. A = 2903, a = 936, and N = 3839.

$$\chi^2 = \frac{(A - 3a)^2}{3N} = \frac{(2903 - 2808)^2}{3 \times 3839} = 0.7836$$

For one degree of freedom, P lies between 0.30 and 0.50, so the fit is satisfactory.

(c) The χ² Test Applied to Several Genetic Families

In genetic data, Kirk and Immer (1928) show that the total class frequencies obtained by summation are composite results which may easily mask a serious lack of consistency in numerical ratios of the separate families with respect to agreement with expectation. To summate the numbers in each class of all progenies is to rely on mean values and thereby disregard deviations from the ratio expected to occur in each family. This applies particularly where the numbers are small. The smaller the number in each progeny, the greater the opportunity to err when the summations are taken as an indication of the genetic constitution.
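The general method of V (a) and the two-class short-cut of V (b) can be compared in a short Python sketch. It is an illustration only, not part of the original manual; the function names are assumptions. The calculated numbers are carried at full precision, so the result for the 9:3:3:1 example may differ in the third decimal from a hand computation based on values rounded to two places.

```python
def chi_square_for_ratio(observed, ratio):
    """General method: chi-square of observed counts against a theoretical ratio."""
    total = sum(observed)
    parts = sum(ratio)
    calculated = [total * r / parts for r in ratio]
    return sum((o - c) ** 2 / c for o, c in zip(observed, calculated))


def two_class_chi_square(big_a, small_a, m, n):
    """Short-cut for an m:n ratio, formula (5): (nA - ma)^2 / (m n N)."""
    total = big_a + small_a
    return (n * big_a - m * small_a) ** 2 / (m * n * total)


# Barley F2 data of V (a): SN, Sn, sN, sn tested against 9:3:3:1.
four_class = chi_square_for_ratio([2061, 645, 675, 256], [9, 3, 3, 1])
print(round(four_class, 4))

# Two-class data of V (b), A = 2903 and a = 936, tested against 3:1 both ways.
shortcut = two_class_chi_square(2903, 936, 3, 1)
general = chi_square_for_ratio([2903, 936], [3, 1])
print(round(shortcut, 4), round(general, 4))
```

The agreement of the last two figures shows that formula (5) is only an algebraic rearrangement of the general method for two classes.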
In such cases, a goodness of fit test like χ² is required which involves in its calculation deviations from expectancy for each class of each progeny. It should be mentioned that χ² values cannot be averaged. However, they are additive provided the number of degrees of freedom is properly taken into account.

VI. Fit of Observed Data to the Normal Curve

The χ² criterion is useful to determine whether or not observed data give an acceptable fit to the normal curve or any other assumed form of distribution. It is useful where the sample is large and where the requirements for χ² are fulfilled. First, the range of measures is divided into an arbitrary number of classes, chosen so that each class contains the number of measures which a valid use of the χ² criterion demands. Data on number of culms counted on 411 wheat plants at the Colorado Experiment Station are used to illustrate the computation. The data are as follows:

X (Class center)    1    3    5    7    9     11    13    15    17    19    21    23    25    27
f (Frequency)       2    24   52   85   114   69    36    18    6     1     2     1     0     1

N = 411    x̄ = 8.9172    s' = 3.4715    s' (corrected for grouping) = 3.4231

(1) The data are regrouped in order to have a larger number of cases in the tail classes.

Classes          Class Range       Frequency
less than 4                        26
5                4.0 to 6.0        52
7                6.0 to 8.0        85
9                8.0 to 10.0       114
11               10.0 to 12.0      69
13               12.0 to 14.0      36
15               14.0 to 16.0      18
more than 16                       11
                                   N = 411

The class range is reduced to σ-units, viz., 2/3.4231 = 0.5843. The correction to the mean above 8.0 = 0.9173/3.4231 = 0.2680 σ-units.

(2) The next step is to calculate the end points of the class intervals in σ-units.

Class range       Area Range (σ-units)    Unit Frequency    Calculated Frequency
Less than 4.0     -∞ to -1.43             0.08              32.9
4.0 to 6.0        -1.43 to -0.85          0.12              49.3
6.0 to 8.0        -0.85 to -0.27          0.19              78.1
8.0 to 10.0       -0.27 to +0.32          0.24              98.6
10.0 to 12.0      +0.32 to +0.90          0.19              78.1
12.0 to 14.0      +0.90 to +1.48          0.11              45.2
14.0 to 16.0      +1.48 to +2.06          0.05              20.6
more than 16.0    +2.06 to +∞             0.02              8.2
Total                                                       411.0

The σ-value for the class that contains the mean is: 0.5843 - 0.2680 = +0.3163σ. This value is the ordinate for 10.0, while -0.27 is the σ-ordinate for 8.0, these values being within the range 8.0 to 10.0. The other area ranges are calculated by the addition of 0.58 to determine the next higher or lower range. For example, it is 0.32 + 0.58 = +0.90 for the range 12.0.

(3) It is now necessary to compute the unit per cent frequency for each class by reference to a table of the probability integral (Table I, appendix). For example, the unit frequency for the range -0.27 to +0.32 is computed as follows:

For t = -0.27, P = 0.61 - 0.50 = 0.11
    t = +0.32, P = 0.63 - 0.50 = 0.13

The frequency per cent for the distance -0.27 to +0.32 is equal to 0.11 + 0.13 = 0.24. This means that 24 per cent of the frequencies would be within this range on the basis of the normal curve. The other values can be calculated in a similar manner, except that the two values are subtracted. The last class, from 2.06 to include the remainder of the curve, is computed as follows:

For t = 2.06, P = 0.98
1.000 - 0.98 = 0.02

(4) The next step is to multiply the per cent frequencies by the number in the sample (N) to obtain the calculated frequencies, e.g., (0.08)(411) = 32.9, etc.

(5) The observed and calculated frequencies are now compared by use of the χ² criterion.

Class range       Observed     Calculated    O - C    (O - C)²    (O - C)²/C
                  Frequency    Frequency
less than 4.0     26           32.9          -6.9     47.61       1.4471
4.0 to 6.0        52           49.3          2.7      7.29        0.1479
6.0 to 8.0        85           78.1          6.9      47.61       0.6096
8.0 to 10.0       114          98.6          15.4     237.16      2.4053
10.0 to 12.0      69           78.1          -9.1     82.81       1.0603
12.0 to 14.0      36           45.2          -9.2     84.64       1.8726
14.0 to 16.0      18           20.6          -2.6     6.76        0.3282
more than 16.0    11           8.2           2.8      7.84        0.9561
Totals            411          411.0                              χ² = 8.8271

P = 0.1172

There are 8 classes, but only 5 degrees of freedom available because 3 constants have been used in fitting the data to the normal curve. It is obvious in this case that the probability (P) is greater than 0.05. Thus, the underlying distribution of the data may have been normal. This method applies to fitting observed data to any hypothetical distribution.

VII. Partition of χ² into its Components

When a discrepancy in a theoretical genetic ratio on the basis of independent inheritance occurs, it may be produced either by linkage or by a departure from the 3 : 1 ratios. Fisher (1934) has suggested a method whereby χ² can be partitioned into its components to determine the source of the discrepancy. In a barley cross, the F2 data were as follows for non-tipped and tipped lateral spikelets (Tt), and for hoods and awns (Kk):

                  TK         Tk        tK        tk        Total
                  (a)        (b)       (c)       (d)
Observed No.      1496       515       550       216       2777
Calculated No.    1562.06    520.69    520.69    173.56    2777.00

χ² = 14.8835    P = very small

To determine whether or not the discrepancy is due to linkage, the χ² value is partitioned into its components as follows:

x = non-tipped vs. tipped = (a + b) - 3(c + d) = (1496 + 515) - 3(550 + 216) = -287
y = hoods vs. awns = (a + c) - 3(b + d) = (1496 + 550) - 3(515 + 216) = -147
z = interaction or linkage = a - 3b - 3c + 9d = 1496 - 3(515) - 3(550) + 9(216) = +245

The χ² values can be computed for each component as follows:

1. Non-tipped vs. tipped: $\chi^2 = \dfrac{x^2}{3n} = \dfrac{(287)^2}{3(2777)} = 9.8870$
2. Hoods vs. awns: $\chi^2 = \dfrac{y^2}{3n} = \dfrac{(147)^2}{3(2777)} = 2.5938$
3. Interaction (or linkage): $\chi^2 = \dfrac{z^2}{9n} = \dfrac{(245)^2}{9(2777)} = 2.4017$

The data can be brought together in a summary form as below:

Factor Pairs                     d.f.    χ²         P
Non-tipped vs. tipped (Tt)       1       9.8870     0.0016
Hooded vs. awned (Kk)            1       2.5938     0.1074
Interaction                      1       2.4017     0.1212
Totals                           3       14.8825    very small

Thus, the 3 : 1 ratio for non-tipped vs. tipped is found to account for a large part of the high χ² value. There is no indication of linkage.

B -- Test for Independence

VIII. Independence and Association

When observations have been classified in two ways, it may be desirable to determine whether or not the two variables are associated. The χ² test for independence has been used for this purpose. Two variables are said to be associated when the numbers in the cells of the contingency table are not randomly distributed. Contingency tables may be manifold, there being (r - 1)(c - 1) degrees of freedom where there are "r" rows and "c" columns. In tests for independence, the subtotals of the classes into which the variates are distributed are used to determine the theoretical frequencies, with the result that the subtotals must be considered as constants in the determination of degrees of freedom. For example, the degrees of freedom in a 2 by 2 contingency table are one. The value of χ² is referred to a χ² table to determine the value of "P" that corresponds to it for the number of degrees of freedom in the contingency table. A "P" value greater than 0.05 indicates lack of proof of association between two variables, i.e., they may be independent. The χ² criterion has proved useful as a test for the independence of two genetic factor pairs.

IX. Calculation of Independence or Association

The test for independence can be made when the data are compiled either in simple 4-fold (2 by 2) or manifold contingency tables.

(a) The Manifold (m by n) Contingency Table

The computation can be illustrated by some F2 data (Hayes) in an oat cross, Bond x D.C., where it was desired to learn whether or not there was any association between the reaction to stem rust and to crown rust. The data are:
The data are (theoretical frequencies in parentheses):

                          Stem Rust Reaction
Crown Rust Reaction    Resistant         Susceptible      Totals    Ratio
Resistant               50 (57.2494)     22 (14.7550)       72      0.2951
Susceptible            119 (112.1126)    22 (28.8950)      141      0.5779
Intermediate            25 (24.6380)      6 (6.3500)        31      0.1270
Totals                 194               50                244      1.0000

If the amount of stem rust infection has no influence on the amount of crown rust infection, the 244 observations would be expected to be distributed at random in the 6 cells of the contingency table, with the restriction that they must add up to give the totals in the table (See Tippett, 1931, p. 69). The probability that an observation will fall in row No. 1 is 72/244, and that it will fall in column No. 1 is 194/244. Then, the probability that an observation will fall in the first cell is (72/244)(194/244). The expected number of individuals in that cell on the basis of independence is the probability multiplied by the total number, i.e., (72/244)(194/244)(244), which is 57.2494 when the row ratio is rounded to 0.2951 as in step (1) below.

The various steps in the computation are as follows:

(1) The ratio of rows, for row No. 1, is 72/244 = 0.2951.

(2) The theoretical frequencies can be obtained by the multiplication of each of the ratios for rows by each of the subtotals for columns, e.g., 0.2951 times 194 = 57.2494 for cell No. 1. The other values are computed in a similar manner. In this case, it is necessary to compute the value for only one other cell, i.e., 0.5779 times 194 = 112.1126. The other values can be obtained by subtraction from the marginal totals.

(3) The observed and theoretical values are then compared by use of the χ² criterion.

Observed No.   Calculated No.   |O - C|    (O - C)²    (O - C)²/C
     50           57.2494        7.2494     52.5538      0.9180
    119          112.1126        6.8874     47.4363      0.4231
     25           24.6380        0.3620      0.1310      0.0053
     22           14.7550        7.2450     52.4900      3.5574
     22           28.8950        6.8950     47.5410      1.6453
      6            6.3500        0.3500      0.1225      0.0193
    244          244.0000                        χ² =    6.5684

Degrees of freedom = (r - 1)(c - 1) = (3 - 1)(2 - 1) = 2
P = 0.0387

Thus, the indications are that there is an association between the reactions to stem rust and to crown rust.

(b) The 2 by 2 or 4-Fold Table

The 4-fold table is often used to test the independence of two genetic factor pairs. The independence of the two 3 : 1 ratios can be tested as follows:

              K              k              Total
V          a = 142        b = 43         a + b = 185
v          c = 49         d = 15         c + d = 64
Totals     a + c = 191    b + d = 58     N = 249

The value of χ² can be determined by the method outlined in (a) above, or it can be computed by a short-cut formula given by Fisher (1934):

$$\chi^2 = \frac{N(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)} \qquad (6)$$

$$= \frac{249\,[(142)(15) - (43)(49)]^2}{(191)(58)(185)(64)} = \frac{(249)(529)}{131{,}163{,}520} = \frac{131{,}721}{131{,}163{,}520} = 0.0010$$

When χ² = 0.0010, P = value close to 1.

(c) Inadequacy of χ²: Correction for Continuity

When the several categories are represented by relatively small frequencies, the value of χ² often gives inaccurate results because the corresponding probability of occurrence is too small. This is particularly the case in a 2 by 2 classification. Yates (1934) has developed a correction that should be applied in such cases. This correction simply amounts to the reduction of the numerical value of each (O - C) determination by 1/2. Thus, in the example above, the correction applied to χ² is as follows:

$$\chi^2\ (\text{corrected}) = \frac{N\left(|ad - bc| - N/2\right)^2}{(a + c)(b + d)(a + b)(c + d)} \qquad (7)$$

$$= \frac{249\,[\,|(142)(15) - (43)(49)| - 249/2\,]^2}{(191)(58)(185)(64)} = \frac{(249)(-101.5)^2}{131{,}163{,}520} = \frac{2{,}565{,}260.25}{131{,}163{,}520} = 0.0196$$

P = value close to 1.

Here |ad - bc| = 23 is smaller than N/2 = 124.5, so the correction more than removes the whole deviation. Since a deviation cannot be reduced below zero, the corrected χ² is more properly taken as 0; in either case P is close to 1. In a 2 by 2 table, even though the frequencies may be fairly large, it is quite proper to introduce the correction. However, the larger the number of categories, the less important is the correction.

X. The Null Hypothesis and χ²

It is important to understand something about the philosophical and logical bases for the making of inferences from χ² as well as from other criteria for significance.
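The contingency-table computations of Section IX can be sketched in Python as a check on the hand arithmetic; this is an illustration under stated assumptions, not a definitive implementation. The function names are hypothetical. Expected frequencies are taken from unrounded marginal totals, so χ² for the oat data comes out near 6.57 rather than exactly the tabled figure, and the continuity correction is not allowed to carry the deviation past zero, which is an assumption of this sketch.

```python
def chi2_independence(table):
    """Chi-square and degrees of freedom for an r-by-c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count under independence, from the marginal totals
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi2_fourfold(a, b, c, d, yates=False):
    """Short-cut formula (6) for a 2-by-2 table; formula (7) if yates=True."""
    n = a + b + c + d
    deviation = abs(a * d - b * c)
    if yates:
        # assumption: the corrected deviation is never pushed below zero
        deviation = max(deviation - n / 2, 0.0)
    return n * deviation ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


# Oat data: rows are crown rust classes, columns are stem rust classes
oats = [[50, 22], [119, 22], [25, 6]]
oat_chi2, oat_df = chi2_independence(oats)

# Barley 4-fold table: a, b, c, d
plain = chi2_fourfold(142, 43, 49, 15)
corrected = chi2_fourfold(142, 43, 49, 15, yates=True)
print(round(oat_chi2, 4), oat_df, round(plain, 4), corrected)
```

For a 2 by 2 table the general routine and the short-cut formula give the same uncorrected value, which provides a convenient check of one against the other.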
The basic premise involved in every test for significance is a negative premise and has been termed the null hypothesis by Fisher (1937). It is simply a tacit assumption of agreement, such as agreement between standard deviations of distributions, and agreement between distributions as a whole. In association and correlation studies the null hypothesis is construed to mean independence or lack of association between characters or conditions under investigation. This tacit negative premise can never be proved. For example, it is impossible to prove statistically that two samples came from the same population, or that the populations which afforded the samples under comparison possess the same means or other statistics. It is impossible to prove statistically that two characters or conditions are independent or devoid of association. To draw such conclusions would simply be to reiterate what was originally only assumed to be true. Therefore, definite conclusions can be drawn only when the criteria for significance have been met, such conclusions being positive in nature. The investigator is able to prove differences to exist, association to be present, etc. In short, he is able to prove the falsity of the null hypothesis but never its truth.

References

1. Fisher, R. A. On the Interpretation of χ² from Contingency Tables, and the Calculation of P. Jour. Roy. Stat. Soc., Vol. LXXXV, Part I. 1922.
2. Fisher, R. A. Statistical Tests of Agreement between Observation and Hypothesis. Economica, 3:139-147. 1923.
3. Fisher, R. A. Statistical Methods for Research Workers (5th Edition), pp. 80-111, and 274-275. 1934.
4. Fisher, R. A. The Design of Experiments, pp. 18-20. 1937.
5. Goulden, C. H. Methods of Statistical Analysis, pp. 88-113. 1939.
6. Harris, J. A. A Simple Test of Goodness of Fit of Mendelian Ratios. Amer. Nat., 46:741. 1912.
7. Immer, F. R. Applied Statistics Manual (mimeographed). 1936.
8. Kirk, L. E., and Immer, F. R. Application of Goodness of Fit Tests to Mendelian Class Frequencies. Sci. Agr., 8:745-750. 1928.
9. Pearson, Karl. Tables for Statisticians and Biometricians, pp. 2-3 and 26-28. 1914.
10. Tippett, L. H. C. The Methods of Statistics, pp. 63-88. 1931.
11. Yates, F. Jour. Roy. Stat. Soc., Suppl. 1, No. 2. 1934.
12. Youden, W. J. Statistical Analysis of Seed Germination thru the Use of the Chi-Square Test. Contrib. Boyce Thompson Inst., 4:219-232. 1932.
13. Yule, G. Udny. An Introduction to the Theory of Statistics (9th edition), pp. 370-378. 1929.
14. Yule, G. Udny. Probability Values for One Degree of Freedom. Jour. Roy. Stat. Soc., 85:93-104. 1922.

Questions for Discussion

1. What are the uses of the χ² criterion?
2. What conditions must be fulfilled in the use of the χ² test?
3. What is the range of χ² values? "P" values?
4. Give a rule for the number of degrees of freedom in a "goodness of fit" test. What is it for a contingency table?
5. What precautions are necessary in the grouping of data for a "goodness of fit" test? Why?
6. How do the Elderton and Fisher tables for χ² differ? What precautions are necessary in the use of each?
7. Interpret "P" = 0.50 on the basis of goodness of fit.
8. In what special case can the normal probability integral table be used to compute "P"? Why?
9. Who is responsible for the χ² test? For what was it first used?
10. Explain how to compute χ² for goodness of fit.
11. What precautions are necessary in the application of the χ² test for goodness of fit to genetic ratios? Why?
12. In the fitting of observed data to that expected on the basis of the normal curve, how many constants are used? Which ones?
13. Under what conditions may it be desirable to partition χ² into its components?
14. What does "P" = 0.01 indicate when obtained from a contingency table?
15. Explain how the probability is calculated for a cell in a contingency table.
16. How does the χ²
test for independence differ from that for goodness of fit?
17. What is meant by the null hypothesis?

PROBLEMS

I. In a barley cross, Robertson (1929) tested black vs. white glumes (Bb) and hoods vs. awns (Kk) for a 9 : 3 : 3 : 1 ratio in the F2. His data were as follows:

Classes                Observed No.    Calculated No.
Black hooded (BK)         2611            2656.7
Black awned (Bk)           920             885.5
White hooded (bK)          860             885.5
White awned (bk)           332             295.2
Totals                    4723            4723

Calculate χ² and interpret it.* Do these data fit a 9 : 3 : 3 : 1 ratio for independent inheritance?

*Note: Statement for P: "A worse result might be expected on the basis of random sampling ___ times in ___ trials."

II. Some data on hoods and awns (Kk) and covered vs. naked (Nn) in barley were tested for a 9 : 3 : 3 : 1 ratio. The observed and calculated results were as follows (Data from Robertson, 1929):

Classes                Observed No.    Calculated No.
Hooded covered (KN)       1969            2046
Hooded naked (Kn)          631             682
Awned covered (kN)         737             682
Awned naked (kn)           250             227
Totals                    3637            3637

Apply the χ² test and interpret it.

III. An F2 segregation of a barley cross, Colsess x Minnesota 84-7, gave these results (Data from Robertson, 1929):

Classes                  Observed No.
Hooded green (KF)           931
Hooded chlorina (Kf)        326
Awned green (kF)            326
Awned chlorina (kf)         119

(a) What ratio fits these data? (b) Apply the χ² test and interpret it. (c) Calculate the probability both from the table by Fisher and from the table of Elderton.

IV. In the F2 of a certain barley cross there were 249 plants with high fertility of the lateral spikelets and 67 with low fertility. Test these data for a 3 : 1 ratio by the χ² test for goodness of fit.

V. A second generation segregation in a barley dihybrid for high and low fertility (Hh) and for black and white glume color (Bb) gave counts as follows:

 HB      Hb      hB      hb
1547     568     478     184

When these data were tested for a calculated 9 : 3 : 3 : 1 ratio, χ² was 8.5718 with P = 0.0365. Partition χ² into its components and determine whether the discrepancy is due to the individual 3 : 1 ratios or to linkage.

VI. Some F2 oat plants were classified on the basis of crown rust and stem rust resistance as follows:

                          Stem Rust Reaction
Crown Rust Reaction    Resistant    Susceptible    Totals
Resistant                  66           43           109
Susceptible                75           24            99
Intermediate               17            5            22
Totals                    158           72           230

Use the χ² test for independence to determine whether or not there is an association between the reaction to stem rust and crown rust.

CHAPTER IX

SIMPLE LINEAR CORRELATION

I. Nature of Correlation

So far, statistical analysis has dealt with a single set of observations to measure a single character. It is now desirable to consider two such sets of observations that measure two different characters. These observations are such that, to any observation in one set, there is naturally paired a corresponding observation of the other. One naturally inquires as to whether there exists any association or connection between the measured characters. Such association exists when an abnormality¹ in one character tends to be accompanied by an abnormality in the other. The characters are said to be correlated when such is the case. For example, height and weight in human beings are said to be correlated. In the aggregate, tall persons are heavier than short persons. To condense what has been said into a precise definition, it may be stated that two characters are correlated when, to a selected set of values of one, there correspond sets of values of the other whose means are functions of those selected values.

II. Description of Correlation

A graphical representation of the totality of paired observations can be obtained by the treatment of each pair of measurements as the rectangular coordinates of a point. Such a diagram of scattered points is called a scatter diagram.
To illustrate, one may consider 20 pairs of observations that relate length (in inches) to weight (in ounces) of ears of corn:

Length (x)   Weight (y)      Length (x)   Weight (y)
   2.5          3.5             6.5          6.5
   2.5          3.0             7.5         10.0
   3.0          5.0             8.0          8.0
   4.0          7.0             8.0         10.0
   4.5          5.5             8.0         12.0
   5.0          8.0             8.5         13.0
   5.5          8.0             9.0         12.0
   6.0         10.0             9.0         14.0
   6.0          7.0             9.5         13.0
   6.5         10.5            10.5         14.0

Mean length (x̄) = 6.5 inches. Mean weight (ȳ) = 9.0 ounces.

¹Abnormality refers to deviations from the mean.

From these pairs of measurements a scatter diagram can be made as follows:

[Scatter diagram: ear length (in.) plotted against ear weight (oz.), with lines drawn at the means x̄ = 6.5 and ȳ = 9.0.]

From the diagram, it is clear that the horizontal and vertical lines that represent the mean length and weight of the ears in the sample separate the plane, in which the points are plotted, into four regions or quadrants. It is also evident that most of the points fall into two of these regions, i.e., those in which the abnormalities of the two characters are of the same type, both above the average or both below the average. Thus, there appears to exist a direct or positive correlation between the characters.

The totality of points that form the scatter very often possesses the rough geometrical form of an ellipse. The position of the ellipse indicates the type of association, i.e., whether positive (direct) or negative (inverse). The shape of the ellipse roughly estimates the degree of correlation. The characters are closely related when the ellipse is narrow. A diagrammatic representation of correlation is given in Figures A, B, and C.¹

[Figure A: Low Correlation. Figure B: Positive Correlation. Figure C: Negative Correlation.]

¹Some statisticians use the first quadrant in correlation analysis while others use the fourth.

The signs for the quadrants are depicted in Figure A.
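The quadrant argument can be illustrated with a short Python sketch that counts how the 20 corn-ear pairs fall around the two mean lines. The function name and the labels are illustrative assumptions; points lying exactly on a mean line are tallied separately, which is a choice made for this sketch.

```python
lengths = [2.5, 2.5, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 6.0, 6.5,
           6.5, 7.5, 8.0, 8.0, 8.0, 8.5, 9.0, 9.0, 9.5, 10.5]
weights = [3.5, 3.0, 5.0, 7.0, 5.5, 8.0, 8.0, 10.0, 7.0, 10.5,
           6.5, 10.0, 8.0, 10.0, 12.0, 13.0, 12.0, 14.0, 13.0, 14.0]


def quadrant_counts(xs, ys):
    """Tally points by the signs of their deviations from the two means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    counts = {"++": 0, "--": 0, "+-": 0, "-+": 0, "on a mean line": 0}
    for x, y in zip(xs, ys):
        dx, dy = x - mean_x, y - mean_y
        if dx == 0 or dy == 0:
            counts["on a mean line"] += 1
        else:
            key = ("+" if dx > 0 else "-") + ("+" if dy > 0 else "-")
            counts[key] += 1
    return counts


counts = quadrant_counts(lengths, weights)
print(counts)
```

Sixteen of the 20 points fall in the two quadrants where the deviations agree in sign, which is the numerical counterpart of the positive correlation seen in the scatter diagram.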
It is noted that the values of x above the mean (x̄) are positive, while those below the mean are negative. The same applies for the y values. The sign for the quadrant is the product of the corresponding marginal signs.

There are two methods employed to describe correlation, i.e., the correlation surface method and the regression method. For an account of the correlation surface method, a text on mathematical statistics should be consulted.

A -- The Correlation Coefficient

III. Measurement of Correlation

A precise mathematical measure of the degree of association between two characters is desirable. In any case, it must be based on an assumption in regard to the mathematical functional relationship that exists between the variables. The most important measure is called the coefficient of correlation, symbolized as r. In the discussion that follows it is assumed that the association is linear, i.e., that the variables x and y are related by an equation, y = ax + b, where a and b are constants.

Suppose one considers each pair of measurements of the two characters as an argument, either strong or weak, for one or the other of two opposite theories of association between the two characters. These theories are that the two characters are related, either positively or negatively. A linear relationship is said to exist between two characters when the means of the values of one character are plotted against the selected values of the other character that correspond to them, and the resulting points are well fitted by a straight line. To measure the contribution of any given pair of measurements (x, y) to one theory of association or the other, one measures the amount of abnormality exhibited by the pair of measurements with respect to each character in units of the respective standard deviations of the samples provided by the 2 sets of variates, i.e., $(x - \bar{x})/s'_x$ and $(y - \bar{y})/s'_y$.
When these measures of the abnormalities of the pairs of observations are multiplied, i.e.,

$$\frac{(x - \bar{x})(y - \bar{y})}{s'_x\, s'_y},$$

the result gives a numerical measure of the argument presented by (x, y) toward a theory of correlation. The product of both abnormalities will be positive when both are of the same type, either positive or negative. Their product will be negative when the abnormalities are opposite in type.

A numerical measure of correlation between the characters under investigation is obtained when the procedure is repeated for every pair of measurements in the sample and the arithmetic mean of the several products is found. The formula for the correlation coefficient (r) is as follows:

$$r = \frac{1}{N}\, S\left[\left(\frac{x - \bar{x}}{s'_x}\right)\left(\frac{y - \bar{y}}{s'_y}\right)\right] \qquad (1)$$

It is obvious that "r" can be plus or minus, thus depicting a positive or negative correlation. It will be shown later that "r" is numerically equal to or less than 1.0. Thus, the association that exists between two characters may be strong, as evidenced by a value of "r" numerically close to 1.0, or weak when "r" is close to 0. The above statement must not be construed too literally but in the light of sampling theory.

IV. Computation of "r" for Ungrouped Data

The relation (1) may be transformed¹ to many different arbitrary forms for computation. Formulas which are useful for small samples are as follows:

$$r = \frac{S(xy) - N\bar{x}\bar{y}}{\sqrt{(Sx^2 - N\bar{x}^2)(Sy^2 - N\bar{y}^2)}} \qquad (2)$$

$$r = \frac{S(xy)/N - \bar{x}\bar{y}}{s'_x\, s'_y} \qquad (3)$$

$$r = \frac{N\,S(xy) - (Sx)(Sy)}{\sqrt{[N\,S(x^2) - (Sx)^2]\,[N\,S(y^2) - (Sy)^2]}} \qquad (4)$$

Formula (3) is the one given by J. Arthur Harris, which is direct, but not suited so well to machine calculation as (2) or (4).

¹Since $(s'_x)^2 = (Sx^2 - N\bar{x}^2)/N$ and $(s'_y)^2 = (Sy^2 - N\bar{y}^2)/N$,

$$r = \frac{1}{N}\,\frac{S(x - \bar{x})(y - \bar{y})}{s'_x\, s'_y} = \frac{\frac{1}{N}\left(Sxy - \bar{x}\,Sy - \bar{y}\,Sx + N\bar{x}\bar{y}\right)}{\frac{1}{N}\sqrt{(Sx^2 - N\bar{x}^2)(Sy^2 - N\bar{y}^2)}} = \frac{Sxy - N\bar{x}\bar{y}}{\sqrt{(Sx^2 - N\bar{x}^2)(Sy^2 - N\bar{y}^2)}}$$

The computation may be illustrated with these data on the length of corn ears in inches and their weight in ounces.

Length (x)   Weight (y)      x²         y²         xy
   2.5          3.5          6.25      12.25       8.75
   2.5          3.0          6.25       9.00       7.50
   3.0          5.0          9.00      25.00      15.00
   4.0          7.0         16.00      49.00      28.00
   4.5          5.5         20.25      30.25      24.75
   5.0          8.0         25.00      64.00      40.00
   5.5          8.0         30.25      64.00      44.00
   6.0         10.0         36.00     100.00      60.00
   6.0          7.0         36.00      49.00      42.00
   6.5         10.5         42.25     110.25      68.25
   6.5          6.5         42.25      42.25      42.25
   7.5         10.0         56.25     100.00      75.00
   8.0          8.0         64.00      64.00      64.00
   8.0         10.0         64.00     100.00      80.00
   8.0         12.0         64.00     144.00      96.00
   8.5         13.0         72.25     169.00     110.50
   9.0         12.0         81.00     144.00     108.00
   9.0         14.0         81.00     196.00     126.00
   9.5         13.0         90.25     169.00     123.50
  10.5         14.0        110.25     196.00     147.00

S(x) = 130.0    S(y) = 180.0    S(x²) = 952.50    S(y²) = 1837.00    S(xy) = 1310.50
x̄ = 6.5    ȳ = 9.0

The symbols x̄ and ȳ are the means of the x and y arrays. The values S(x²) and S(y²) are the sums of the squares of the separate entries of x and y, respectively. The value S(xy) is the sum of the products of each value of x by the corresponding value of y. In practice, only the sums of the various values are recorded in machine calculation.

The values may be substituted in (2) as follows:

$$r = \frac{S(xy) - N\bar{x}\bar{y}}{\sqrt{(Sx^2 - N\bar{x}^2)(Sy^2 - N\bar{y}^2)}} = \frac{1310.50 - (20)(6.5)(9.0)}{\sqrt{(952.50 - 845.00)(1837.00 - 1620.00)}}$$

$$= \frac{1310.50 - 1170.00}{\sqrt{(107.50)(217.00)}} = \frac{140.50}{\sqrt{23{,}327.50}} = \frac{140.50}{152.73} = 0.920$$

Those who use this formula for the computation of r are warned that a serious error may be introduced by dropping decimals. The means should be carried out to twice the number of decimal places as appear in the original data. The formulae given above are particularly valuable when N is small, i.e., less than 50.

V. Calculation from a Correlation Surface

The correlation coefficient may be calculated from a correlation surface with the deviations from the assumed means taken on an arbitrary scale. It is necessary to apply corrections for the means, standard deviations, and class intervals. Fisher (1934) has made a contribution to simplicity in the mechanical computation of the correlation coefficient, his method¹ being used below. The data are for the correlation of total grain weight (x) in grams and culm length (y) in centimeters in wheat plants.

¹In the determination of standard deviations where Sheppard's Correction has been used, the uncorrected standard deviations should be used in computing r.

Table 1. Correlation Table for Grain Weight and Culm Length in Wheat.

[The body of the table, 12 culm-length classes by 11 grain-weight classes, is not legible in this copy. The class centers, coded values, and marginal frequencies are as follows.]

Culm length (y), class center:   62   67   72   77   82   87   92   97  102  107  112  117
Coded value Y:                    1    2    3    4    5    6    7    8    9   10   11   12
Frequency f_y:                    3    7   14   18   34   61   83   83   88   71   26    6    (Total 494)

Grain weight (x), class center:  9.5  29.5  49.5  69.5  89.5  109.5  129.5  149.5  169.5  189.5  209.5
Coded value X:                    1     2     3     4     5      6      7      8      9     10     11
Frequency f_x:                    1    15    43    96   114     90     61     35     21     12      6    (Total 494)

The data may be arranged as follows:

Table 2. Computation of the Correlation Coefficient

Average length of culms (y):

  (1)        (2)    (3)     (4)      (5)       (6)         (7)
 Class                                        Total for   Product
 Center       Y     f_y     Yf_y    Y²f_y     Wt. S'Xf    YS'Xf
   62         1       3       3        3        10          10
   67         2       7      14       28        19          38
   72         3      14      42      126        48         144
   77         4      18      72      288        74         296
   82         5      34     170      850       167         835
   87         6      61     366     2196       363        2178
   92         7      83     581     4067       495        3465
   97         8      83     664     5312       453        3624
  102         9      88     792     7128       500        4500
  107        10      71     710     7100       402        4020
  112        11      26     286     3146       161        1771
  117        12       6      72      864        44         528
 Totals             494    3772    31108      2736       21409

Average grain weight (x):

  (8)        (9)    (10)    (11)     (12)      (13)        (14)
 Class                                        Total for   Product
 Center       X     f_x     Xf_x    X²f_x    length S'Yf  XS'Yf
   9.5        1       1       1        1         3           3
  29.5        2      15      30       60        51         102
  49.5        3      43     129      387       254         762
  69.5        4      96     384     1536       696        2784
  89.5        5     114     570     2850       963        4815
 109.5        6      90     540     3240       770        4620
 129.5        7      61     427     2989       466        3262
 149.5        8      35     280     2240       246        1968
 169.5        9      21     189     1701       178        1602
 189.5       10      12     120     1200       104        1040
 209.5       11       6      66      726        41         451
 Totals             494    2736    16930      3772       21409

$$\bar{Y} = \frac{S(Yf_y)}{N} = \frac{3772}{494} = 7.6356 \qquad \bar{X} = \frac{S(Xf_x)}{N} = \frac{2736}{494} = 5.5385$$

The details of computation are explained as follows:

1. To simplify the arithmetic, the variables X and Y are used in place of x and y, respectively. They are related by:

$$X = \frac{x - x_1}{C_x} + 1 \qquad Y = \frac{y - y_1}{C_y} + 1$$

where x₁ and y₁ represent the class centers of the first classes, and C_x, C_y are the class intervals of the x and y distributions, respectively.

2. The values for Yf_y (in column 4) are the products of the class values, Y, and their respective frequencies. The values for column 11 are computed in a similar manner.

3. The values for Y²f_y (in column 5) are the products of columns 2 and 4 for the respective values of Y. The X²f_x values in column 12 are computed from columns 9 and 11.

4. The total deviations in culm length (y variable) are shown in column 13 for each column for grain weight (x variable). Here the symbol (f) without subscripts indicates the frequency of one cell of the correlation table, i.e., the frequency of a particular value of X accompanied by a particular value of Y. The symbol S' denotes the total over just one array. It is necessary to refer to Table 1 to compute these values, each term below being (Y)(f):

1st Y-array = (3)(1) = 3
2nd Y-array = (1)(1) + (2)(4) + (3)(4) + (4)(2) + (5)(2) + (6)(2) = 51
3rd Y-array = (2)(1) + (3)(3) + (4)(8) + (5)(7) + (6)(6) + (7)(8) + (8)(7) + (9)(2) + (10)(1) = 254
etc.
The values for the X-arrays in column 6 are computed in a similar manner.

5. For the product (XS'Yf), multiply each value of the total for length in column 13 by its respective X-value in column 9. For example, (3)(1) = 3, (51)(2) = 102, etc. The values in column 7 are computed similarly. It is noted that the ultimate result, S(XY), of the computations carried out in columns (6)(7) and (13)(14) is the same. Thus, one provides a check on the other.

6. The computed values in Table 2 are then substituted in formula No. 2 above:¹

$$r = \frac{S(XY) - N\bar{X}\bar{Y}}{\sqrt{(SX^2 - N\bar{X}^2)(SY^2 - N\bar{Y}^2)}} = \frac{21{,}409 - (494)(7.6356)(5.5385)}{\sqrt{[16{,}930 - (494)(5.5385)^2]\,[31{,}108 - (494)(7.6356)^2]}}$$

$$= \frac{21{,}409 - 20{,}891.16}{\sqrt{(16{,}930 - 15{,}153.45)(31{,}108 - 28{,}801.39)}} = \frac{517.84}{\sqrt{4{,}097{,}808.00}} = \frac{517.84}{2024.30} = 0.2558$$

¹The data in the problem above have been coded. Suppose a = assumed mean, and C = class interval. It can be shown that the correlation coefficient from coded data is equal to that from the natural numbers, viz., $r_{xy} = r_{XY}$. With $X = (x - a_x)/C_x$ and $Y = (y - a_y)/C_y$, one has $x - \bar{x} = C_x(X - \bar{X})$ and $s'_x = C_x\, s'_X$, and similarly for y, so that

$$r_{xy} = \frac{S(x - \bar{x})(y - \bar{y})}{N\, s'_x\, s'_y} = \frac{C_x C_y\, S(X - \bar{X})(Y - \bar{Y})}{N\, C_x C_y\, s'_X\, s'_Y} = r_{XY}$$

7. The true means can be computed from the above values as follows:

$$\bar{y} = (\bar{Y} - 1)\,C_y + y_1 = (7.6356 - 1)(5) + 62 = 95.1780$$
$$\bar{x} = (\bar{X} - 1)\,C_x + x_1 = (5.5385 - 1)(20) + 9.5 = 100.2700$$

VI. Use of the Correlation Coefficient for Error of a Difference

The correlation coefficient may be used to reduce the standard error of a difference ($\sigma_d$) when there exists a correlation between the paired values of two variables. This usually enables one to obtain significance with smaller differences than is possible with the formula $\sigma_d = \sqrt{a^2 + b^2}$, given previously (See Chapter 6). However, it is seldom worthwhile to apply the correlation formula unless "r" is large, because the reduction in error is usually insufficient to justify the greater amount of calculation.
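The coded computation of Section V can be checked with a short Python sketch that works only from the totals of Table 2. This is an illustration under stated assumptions: the function and variable names are hypothetical, and formula (4) is used in place of formula (2) so that no rounded means enter the arithmetic.

```python
import math


def r_from_sums(n, sx, sy, sxx, syy, sxy):
    """Correlation coefficient from sums, as in formula (4)."""
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator


def decode_mean(coded_mean, first_center, interval):
    """Convert a coded mean back to original units (step 7)."""
    return (coded_mean - 1) * interval + first_center


# Totals of Table 2: X is coded grain weight, Y is coded culm length
N, SX, SY = 494, 2736, 3772
SXX, SYY, SXY = 16930, 31108, 21409

r_coded = r_from_sums(N, SX, SY, SXX, SYY, SXY)
mean_grain_weight = decode_mean(SX / N, 9.5, 20)   # grams
mean_culm_length = decode_mean(SY / N, 62, 5)      # centimeters
print(round(r_coded, 4), round(mean_grain_weight, 2), round(mean_culm_length, 2))
```

Because coding only shifts and rescales each variable, the value of r obtained from the coded totals is also the correlation in the original units.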
The extended formula for the standard error of a difference is as follows:

$$\sigma_d = \sqrt{a^2 + b^2 - 2\,r_{ab}\,ab} \qquad (5)$$

In this formula a and b represent the standard errors of the separate values being compared, and r, the coefficient of correlation between the separate measurements of these quantities.

The averages for the heading and blossoming stages of irrigation of spring wheat over a 9-year period² may be taken to show the value of the correlation coefficient in the reduction of the standard error. The average yields of grain in pounds per plot, together with their standard errors, are as follows:

                Stage of Irrigation
Year        Heading               Blossoming
1921         478 ± 52              430 ± 46
1922         776 ± 57              637 ± 31
1923        1114 ± 58              947 ± 49
1924        1218 ± 53             1189 ± 52
1925         555 ± 28              524 ± 27
1926        [illegible] ± 59       645 ± [illegible]
1927        [illegible] ± 59      1035 ± 39
1928         639 ± [illegible]     614 ± 35
1929         895 ± 113             839 ± 107
Mean         833 ± 19              762 ± 18

The coefficient of correlation was calculated for the paired annual yields by the use of the formula for ungrouped data as explained in paragraph IV, viz., r = +0.406.

$$\sigma_d = \sqrt{a^2 + b^2 - 2\,r_{ab}\,ab} = \sqrt{(19)^2 + (18)^2 - (0.812)(19)(18)} = 20.18$$

$$d/\sigma_d = 71/20.18 = 3.52$$

The standard error of the difference, calculated without the use of the correlation coefficient to reduce the error, was as follows:

$$\sigma_d = \sqrt{a^2 + b^2} = \sqrt{(19)^2 + (18)^2} = 26.17$$

$$d/\sigma_d = 71/26.17 = 2.71$$

¹y₁ = first true class value. C_x, C_y = class intervals.
²Robertson, D. W., et al. Studies on the Critical Period of Applying Water to Wheat. Data from Colorado Experiment Station.

It is apparent how the test comparing the averages of the yearly means is strengthened by taking into account the correlation due to years.

VII. Significance of the Correlation Coefficient

The test for significance is to determine the probability (P) that the observed correlation could have arisen by random sampling from a population in which the correlation is zero.
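Looking back at formula (5) of Section VI, the effect of the correlation term on the irrigation comparison can be verified with a brief Python sketch. The function name is an illustrative assumption; the inputs are the two mean standard errors (19 and 18), the difference between the means (833 - 762 = 71), and r = 0.406 from the text.

```python
import math


def se_difference(a, b, r=0.0):
    """Standard error of a difference, formula (5); r = 0 gives sqrt(a^2 + b^2)."""
    return math.sqrt(a * a + b * b - 2 * r * a * b)


difference = 833 - 762
se_with_r = se_difference(19, 18, r=0.406)
se_without_r = se_difference(19, 18)
print(round(se_with_r, 2), round(difference / se_with_r, 2))
print(round(se_without_r, 2), round(difference / se_without_r, 2))
```

With a positive correlation the subtracted term shrinks the standard error, so the same difference of 71 pounds is a larger multiple of its error.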
The t-test is more accurate for small samples while the standard error test is satisfactory for large samples.

(a) The Standard Error Test

In large samples drawn from a population in which the mean value of r is zero, the standard error of "r" is given by:

$$\sigma_r = \frac{1 - r^2}{\sqrt{N - 1}} \qquad (6)$$

From the standard error, $r/\sigma_r$ is computed to determine significance. When $r/\sigma_r$ is less than 2.0, the observed relation may well be due to chance rather than to correlation between the variables compared.

Fisher (1934) states that, in the use of the above test, the value of r itself introduces an error which is magnified when r is squared. Only in the case of large samples (greater than 100 pairs of observations) can the standard error test be used safely. Further, the distribution of r, at least for the stronger values, is so skewed that it is unwise to make any interpretation of difference in terms of $\sigma_r$ based on probabilities related to the normal curve.

(b) The t-test for Significance

For small samples, the distribution of r is not sufficiently close to normal to justify the ordinary standard error test. Fisher (1934) has developed the "t" test as a more accurate test for significance. This test measures the probability of obtaining a given value of r from a sample of paired values of a given size due to chance alone. A value of this probability of less than P = 0.05 indicates that the association of the characters is unlikely to be due to chance alone, and it is therefore judged significant. The formula for "t" for a correlation coefficient is as follows:

$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} \qquad (7)$$

In this formula N = the number of pairs of observations. The degrees of freedom for the estimation of a correlation coefficient are N - 2, due to the fact that two statistics are calculated from the sample. The use of "t" may be illustrated with the correlation of ear length (x) and weight (y) in corn (Par. IV).
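The illustration can also be sketched in Python, recomputing r from the 20 raw pairs by formula (4) and then applying formula (7). The function names are hypothetical and the block is only a check on the arithmetic; with exact squares, r comes to about 0.920 and t to about 9.95.

```python
import math

lengths = [2.5, 2.5, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 6.0, 6.5,
           6.5, 7.5, 8.0, 8.0, 8.0, 8.5, 9.0, 9.0, 9.5, 10.5]
weights = [3.5, 3.0, 5.0, 7.0, 5.5, 8.0, 8.0, 10.0, 7.0, 10.5,
           6.5, 10.0, 8.0, 10.0, 12.0, 13.0, 12.0, 14.0, 13.0, 14.0]


def pearson_r(xs, ys):
    """Correlation coefficient by formula (4), from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2))


def t_for_r(r, n):
    """t for a correlation coefficient, formula (7), with n - 2 d.f."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r = pearson_r(lengths, weights)
t = t_for_r(r, len(lengths))
print(round(r, 3), round(t, 2))
```

The computed t is then compared with the tabled value of t for N - 2 = 18 degrees of freedom.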
$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} = \frac{0.920\sqrt{20 - 2}}{\sqrt{1 - (0.920)^2}} = 9.96$$

In the "t" table it is noted that for 18 degrees of freedom, the value of t required for P = 0.05 is 2.101. Thus, the above value is judged to be highly significant. The same result can be obtained from Table VA in Fisher (1934).

(c) Difference between Correlation Coefficients

A test for the significance of differences between correlation coefficients has been suggested by Fisher (1934) as follows:

$$z' = \tfrac{1}{2}\left[\log_e(1 + r) - \log_e(1 - r)\right] \qquad (8)$$

The standard error would be as follows:

$$\sigma_{z'} = \frac{1}{\sqrt{N - 3}} \qquad (9)$$

The method may be illustrated from an example given by Goulden (1937), who studied the relation between the carotene content of wheat flour and the color of bread for 139 wheat varieties. The correlation coefficients were as follows:

Carotene in whole wheat with crumb color, r₁ = -0.4951
Carotene in flour with crumb color, r₂ = -0.5791

The z' test would be applied as follows:

$$z'_1 = \tfrac{1}{2}\left[\log_e(1 + 0.4951) - \log_e(1 - 0.4951)\right] = \tfrac{1}{2}\log_e\frac{1.4951}{0.5049} = \tfrac{1}{2}\log_e 2.9612 = 0.5428$$

$$z'_2 = \tfrac{1}{2}\left[\log_e(1 + 0.5791) - \log_e(1 - 0.5791)\right] = \tfrac{1}{2}\log_e\frac{1.5791}{0.4209} = \tfrac{1}{2}\log_e 3.7517 = 0.6612$$

$$d_{z'} = z'_2 - z'_1 = 0.6612 - 0.5428 = 0.1184$$

$$\sigma_{z'_2 - z'_1} = \sqrt{\frac{1}{136} + \frac{1}{136}} = 0.1213$$

$$d_{z'}/\sigma_{z'} = 0.1184/0.1213 = 0.9761$$

Since the difference is less than its standard error, it is not significant. The formula for z' deals only with the numerical value of r, no attention being paid to algebraic signs. It may be noted that the z test for significance of r is superior to the devices heretofore described.

VIII. Interpretation of the Correlation Coefficient

Certain precautions are necessary in correlation analysis. First of all, the characters of the individuals under consideration must be paired for some logical reason. The sample should also be representative of the population.
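As a check on the z' computation in Section VII (c), the comparison of the two carotene correlations can be sketched in Python. The function names are illustrative assumptions. Signed values of r are transformed and the absolute difference is taken, which agrees with the text's use of numerical values here because both coefficients have the same sign.

```python
import math


def fisher_z(r):
    """Fisher's z' transformation, formula (8)."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))


def z_difference_ratio(r1, n1, r2, n2):
    """Difference of two z' values divided by its standard error."""
    difference = abs(fisher_z(r2) - fisher_z(r1))
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return difference, standard_error, difference / standard_error


diff, se, ratio = z_difference_ratio(-0.4951, 139, -0.5791, 139)
print(round(diff, 4), round(se, 4), round(ratio, 3))
```

A ratio below about 2 gives no grounds for regarding the two correlation coefficients as different.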
Ordinarily it is inadvisable to calculate correlations on numbers where N is less than 30. Caution should be used in the application of correlation statistics where N is less than 50.

Spurious correlation is a condition where the things compared are not causally related to each other, but are both related to a third cause. Frequently there is a tendency to assume that a significant correlation coefficient is proof of a causal relation between two variables. This may not be true. Extreme caution should be used in inferring cause from a correlation coefficient.

B -- Linear Regression

IX. Theory of Regression

A regression is said to be linear when the means of the sets of values of one character which correspond to given values of the other character can be well fitted graphically by a straight line. Under such conditions the coefficient of correlation (r) is a valid measure of association. From the definition, it is evident that there must be two regression lines. They are termed the lines of regression of x on y, and y on x.

[Diagram: scatter of the x- and y-distributions with the two regression lines, AB (x on y) and CD (y on x), crossing at the point of the means.]

This diagram should give a clear conception of what is meant by regression. The elliptical nature of the scatter is shown with the dots which indicate the means of the individuals in each array, both horizontal and vertical. The means of all the rows fall approximately in a straight line, as well as those for columns. These lines, called the regression lines, intersect at a point which indicates the means of the two general distributions, x and y.

The mathematical equations of these lines can be obtained by the method of least squares. The line AB is the regression of x on y. Its equation is as follows:

$$x - \bar{x} = r\,\frac{s'_x}{s'_y}\,(y - \bar{y}) \qquad (9)$$

where x is the value estimated, say $x_e$.

The line CD is the regression line of y on x. Its equation is as follows:

$$y - \bar{y} = r\,\frac{s'_y}{s'_x}\,(x - \bar{x}) \qquad (10)$$

where y is the value estimated, say $y_e$.
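The two regression lines can be sketched numerically for the corn-ear data of Section II. The Python below is an illustration only, with hypothetical function names. The slopes are computed as the sum of products of deviations divided by the sum of squared deviations of the given variable, which is algebraically the same as r s'_y/s'_x and r s'_x/s'_y in equations (10) and (9).

```python
lengths = [2.5, 2.5, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 6.0, 6.5,
           6.5, 7.5, 8.0, 8.0, 8.0, 8.5, 9.0, 9.0, 9.5, 10.5]
weights = [3.5, 3.0, 5.0, 7.0, 5.5, 8.0, 8.0, 10.0, 7.0, 10.5,
           6.5, 10.0, 8.0, 10.0, 12.0, 13.0, 12.0, 14.0, 13.0, 14.0]


def regression_slopes(xs, ys):
    """Return the two means and the slopes of y on x and of x on y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy / sxx, sxy / syy


mean_x, mean_y, b_yx, b_xy = regression_slopes(lengths, weights)


def predict_weight(length):
    """Estimated weight for a given length, equation (10)."""
    return mean_y + b_yx * (length - mean_x)


def predict_length(weight):
    """Estimated length for a given weight, equation (9)."""
    return mean_x + b_xy * (weight - mean_y)


print(round(b_yx, 4), round(b_xy, 4), round(predict_weight(8.0), 2))
```

Both lines pass through the point of the means (6.5, 9.0), and the product of the two slopes equals r², which offers a quick check on the arithmetic.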
These equations may be used to predict or estimate the most probable value of one character to accompany or be associated with a given value of the other character. When a certain value is given to y in the equation $x_e - \bar{x} = r\,(s_x/s_y)(y - \bar{y})$, one can solve for the predicted value of x that corresponds to it. Likewise, when a value is given to x in $y_e - \bar{y} = r\,(s_y/s_x)(x - \bar{x})$, the most likely value for the y that accompanies it can be found. Predicted values given by the regression equations are conservative. Actually, the term "regression" is a result of this tendency. These equations have little or no value for prognosis unless r is quite strong. The two diagrams below depict predicted values given by regression equations in two cases, i.e., where the correlation is strong and where it is weak.

[Figure: predicted values from the regression line CD in Case I (strong correlation) and Case II (weak correlation).]

In each case, observe the same given value of x indicated by the point x on the upper line of each diagram. The vertical line from the point x to the line CD measures the predicted value of y. The portion (AB) illustrates the amount of abnormality predicted. This is seen to be much smaller in Case II where the correlation is weak. Moreover, the standard error of an estimated value is so large that, unless "r" is high, the reliability of an estimated value is small. Although a single predicted value is of little avail unless a very high degree of association exists between two characters, the regression measured by the coefficients $r\,s_y/s_x$ and $r\,s_x/s_y$ may be quite appreciable in one case or the other, even when r is small. This is due to the fact that variation in one character may be quite low. For instance, the association between the yield of a crop obtained from several plots and a certain treatment given in various degrees of intensity to the plots may be quite low. The first reaction would be that the treatment is not justified.
However, the regression might be appreciable, so that the treatment might be very worth while for the crop as a whole.

The more important interpretation of a predicted value from a regression equation lies in the fact that it may be considered as a mean estimated value of the variable which may be expected to result in connection with a number of identical values of the second variable. Such an estimated mean would have a standard error = $s_e/\sqrt{m}$, where m = number of repeated cases of the second variable, and $s_e$ is the standard error of regression (See Section X (c) below).

X. Computation of Regression Equations for Grouped Data

The equation for the regression coefficient is as follows:

$$b_{yx} = \frac{S(X - \bar{X})(Y - \bar{Y})}{S(X - \bar{X})^2} \qquad (11)$$

The most convenient formulae for machine computation are as follows:

$$b_{yx} = \frac{S(XY) - (SX)(SY)/N}{S(X^2) - (SX)^2/N} \qquad (12)$$

or

$$b_{yx} = \frac{N\,S(XY) - (SX)(SY)}{N\,S(X^2) - (SX)^2} \qquad (13)$$

Where the transformed variables (X, Y) are not used, the same relations hold in terms of the original variables (x, y).

(a) Computation of Regression Coefficients

The computation may be illustrated for the correlation between total grain weight and average lengths of culms in wheat plants (Paragraph V above).
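The machine formula needs only the running sums, which makes it easy to sketch in code. The function name below is illustrative and not from the manual; the sums are the coded totals from Table 2 that the worked computation in this section quotes (N = 494, SXY = 21,409, SX = 2736, SY = 3772, SX² = 16,930, SY² = 31,108).

```python
def slope_from_sums(n, s_xy, s_x, s_y, s_xx):
    """Regression slope from raw sums: [N S(XY) - SX SY] / [N S(X^2) - (SX)^2]."""
    return (n * s_xy - s_x * s_y) / (n * s_xx - s_x ** 2)

n = 494
s_xy, s_x, s_y = 21409, 2736, 3772
s_xx, s_yy = 16930, 31108

# Regression of Y on X, and of X on Y (the roles of X and Y are exchanged).
b_yx = slope_from_sums(n, s_xy, s_x, s_y, s_xx)
b_xy = slope_from_sums(n, s_xy, s_y, s_x, s_yy)

# The same slope from the form that divides the correction terms by N.
b_yx_alt = (s_xy - s_x * s_y / n) / (s_xx - s_x ** 2 / n)

# Constant term of the regression of Y on X: mean of Y minus slope times mean of X.
intercept_y_on_x = s_y / n - b_yx * (s_x / n)

print(round(b_yx, 4), round(b_xy, 4), round(intercept_y_on_x, 4))
```

Working from unrounded means gives values that agree with the hand computation to about the fourth decimal.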
Calculations from Table 2, which can be used here, are as follows for coded data:

SXY = 21,409; SX = 2736; SY = 3772; $\bar{X}$ = 5.5385; $\bar{Y}$ = 7.6356; N = 494; SX² = 16,930; SY² = 31,108

By substitution in formula 13:

$$b_{yx} = \frac{N\,S(XY) - (SX)(SY)}{N\,S(X^2) - (SX)^2} = \frac{(494)(21{,}409) - (2736)(3772)}{(494)(16{,}930) - (2736)^2} = \frac{10{,}576{,}046 - 10{,}320{,}192}{8{,}363{,}420 - 7{,}485{,}696} = \frac{255{,}854}{877{,}724} = 0.2915$$

$$b_{xy} = \frac{N\,S(XY) - (SX)(SY)}{N\,S(Y^2) - (SY)^2} = \frac{(494)(21{,}409) - (2736)(3772)}{(494)(31{,}108) - (3772)^2} = \frac{255{,}854}{15{,}367{,}352 - 14{,}227{,}984} = \frac{255{,}854}{1{,}139{,}368} = 0.2246$$

(b) Substitution in Regression Equation

The equation for the regression of Y on X is as follows:

$$Y_e = \bar{Y} - b_{yx}\bar{X} + b_{yx}X = 7.6356 - (0.2915)(5.5385) + (0.2915)X = 6.0211 + 0.2915X \quad 1/$$

The equation for the regression of X on Y is calculated in a similar manner.

$$X_e = \bar{X} - b_{xy}\bar{Y} + b_{xy}Y = 5.5385 - (0.2246)(7.6356) + (0.2246)Y = 3.8235 + 0.2246Y$$

(c) Significance of Regression Coefficients

The "t" test for significance of the regression coefficient, $b_{yx}$ = 0.2915 (coded basis), can be determined as follows:

$$S(Y - Y_e)^2 = S(Y - \bar{Y})^2 - b_{yx}^2\,S(X - \bar{X})^2 = SY^2 - N\bar{Y}^2 - b_{yx}^2\,(SX^2 - N\bar{X}^2)$$
$$= 31{,}108 - (494)(7.6356)^2 - (0.2915)^2\,[16{,}930 - (494)(5.5385)^2]$$
$$= 31{,}108 - 28{,}801.3856 - 0.0850\,(16{,}930 - 15{,}153.4500)$$
$$= 2{,}306.6144 - (0.0850)(1{,}776.55) = 2155.6076$$

$$s_e = \sqrt{\frac{S(Y - \bar{Y})^2 - b_{yx}^2\,S(X - \bar{X})^2}{N - 2}} = \sqrt{\frac{2155.6076}{492}} = 2.0932$$

$$t = \frac{b_{yx}\sqrt{S(X - \bar{X})^2}}{s_e} = \frac{0.2915\sqrt{1776.55}}{2.0932} = 5.87$$

This indicates that the regression coefficient is highly significant. The coefficient, $b_{xy}$ = 0.2246, can be tested in a similar manner.

1/ The coded values are changed into actual values by the conversion of Y to y, X to x, and the multiplication of $b_{yx}$ by $C_y/C_x$ as follows:
$\bar{y} = (\bar{Y} - 1)\,C_y + y_1 = (7.6356 - 1)(5) + 62 = 95.1780$
$\bar{x} = (\bar{X} - 1)\,C_x + x_1 = (5.5385 - 1)(20) + 9.5 = 100.2700$
$b_{yx} = (0.2915)(5/20) = 0.0729$
$y_e = \bar{y} - b_{yx}\bar{x} + b_{yx}x = 95.1780 - (0.0729)(100.27) + (0.0729)x = 87.8685 + 0.0729x$

References

1. Elderton, W. P. Frequency Curves and Correlation. Layton. 1906.
2. Ezekiel, M. Methods of Correlation Analysis. John Wiley & Sons. 1930.
3. Fisher, R. A. Statistical Methods for Research Workers (5th edition). Oliver and Boyd. pp. 160-197. 1934.
4. Goulden, C. H. Methods of Statistical Analysis. Wiley. pp. 52-77. 1939.
5. Hayes, H. K., and Garber, R. J. Breeding Crop Plants. McGraw-Hill. pp. il-3-T>5< 1927.
6. Snedecor, G. W. Statistical Methods. Collegiate Press. pp. 89-133. 1937.
7. Tippett, L. H. C. The Methods of Statistics (2nd edition). Williams and Norgate. pp. 140-188. 1937.
8. Treloar, A. E. Outlines of Biometric Analysis. Burgess. pp. 40-63. 1933.
9. Wallace, H. A., and Snedecor, G. W. Correlation and Machine Calculation. Collegiate Press (Ames). 1931.

Questions for Discussion

1. Define correlation.
2. What is a scatter diagram? How is it influenced by high correlation? Low correlation?
3. When are two variables said to be correlated? Not correlated?
4. What is the generally accepted method for the measurement of correlation? Its limitations?
5. What is meant by r = +1, r = -1, and r = 0?
6. How can the standard error of the difference be reduced by the use of correlation?
7. Why is the "t" test preferable to the standard error test for testing the significance of r?
8. What precautions must be exercised in the interpretation of the correlation coefficient? Why?
9. Under what conditions is "r" a valid measure of paired relationships?
10. What is regression? Its use?
11. Explain what is meant by the regression of y on x. Regression of x on y.

Problems

1.
These data were collected to study the relationship between the soil moisture con- tent and the yield of wheat (Data from Salmon) : Mo i st ure ( x ) Y i e 1 d ( y ) Mo i st ur e ( x ) Yield (y) Meisture(x) Yleld(y) Moist ure (x) Yleld (y) 21 1 25 3d 18 10 le 3 17 1 23 • 2k 21 28 15 k 17 3 26 39 21 28 18 8 18 3 18 22 25 17 12 21 21 18 .0 23 29 17 13 20 2k 18 17 17 16 19 20 18 16 7 16 15 19 7 18 3 15 o 19 11 17 19 19 7 -> 19 9 19 10 16 21 15 11 18 23 18 k 16 21 15 9 18 23 21 k 16 20 13 9 18 27 - -13 36 2^ 32 15 9 18 23 26 k'J 2k 37 18 13 jl° 3 (a) Calculate the coefficient of correlation (r) by the machine method for ungrouped data, (b) Test the significance of "r" by the "t" test. 101 2. The average length of culms and the average diameter of culms was measured on k$6 wheat plants at the Colorado Experiment Station. The data follows: Av. Diameter culms (mm. ) Avera ge length of culms (cm.) : x 60 65 70 75 80 85 90 95 100 105 110 115 y 6k 69 7^ 79 8k 89 9k 99 10U 109 llU 119 91 -100 1 1 101 -110 1 2 111- -120 1 1 1 121- -130 1 k 2 1 2 1 131- -1^0 1 6 3 3 2 114-1- -150 2 3 1 6 lv 2 1 151- -160 2 5 5 16 27 8 7 1 161. -170 1 3 8 13 23 25 26 6 2 1 171- -180 3 1 9 10 15 26 13 8 8 181 • -190 1 9 10 10 23 32 1+ 3 191- -200 2 2 8 13 20 12 1 201- -210 2 2 3 5 k 1 211- -220 2 Calculate r and test it for significance with "t" test. 3. The correlation between the reaction to Helminthosporium in F3 and F5 barley lines was studied at the Minnesota Experiment Station. The reactions are given in per- centage infection for 1921 and 1922. The data follow: Percentage in 1921 Percentage in 1922 12 15 13 21 2k 27 9 12 15 18 21 2 2 2 1 1 3 2 3 1 1 2 3 2 1 1 1 k 2 1 1 Calculate the coefficient of correlation and the regression lines, regression lines. Plot the k. The 9-year average yields of wheat for the period 1921-29 were as follows when irrigated at the germination and filling stages. Five plots were averaged each year. 
The data follow in grams per plot:

Year Grown, Germination (gms.): 192.1 ' 511 ± *7 1922 655 * 21 1925 91A t 52 1924 952 t 23 1925 1+70 ± 16 1926 557 - 29 1927 783 ± 20 1928 .- r ■" J r 066 - 20 1929 756 ± 65
Filling (gms.): •321 23 518 ~ 17 733 i 26 125 4. 3k 538 ± 18 601 -f - 31 812 i 511 •f 18 733 z 63
Means: 685 ± 11 (germination), 657 ± 11 (filling)

Determine whether or not the wheat irrigated at germination differs significantly in yield from that irrigated at jointing and tillering. Calculate $\sigma_d$ of an average of a difference by the formula $\sigma_d = \sqrt{\sigma_a^2 + \sigma_b^2}$, and by the extended formula for use of r. Compare $d/\sigma_d$ for both formulas.

CHAPTER X

THE ANALYSIS OF VARIANCE

I. Generalized Standard Error Methods

The basis and purpose of all statistical methods is to analyze and measure variability. Variation between observations may be due to one or more recognizable causative factors. In addition, in all statistical work, there occur variations between observations that result from the coalition of a large aggregate of chance factors which defy control. This latter type of variation between observations results in various types of statistical distributions when it is attempted to describe homogeneous populations.

Suppose one considers a population wherein the variability may be due to both the combinations of innumerable chance factors and the non-homogeneity of the population. In other words, the population naturally and logically submits to sub-division into several homogeneous groups or sub-populations. Such a situation is common in variety tests in field experimentation. Generalized standard error methods have been devised for data of this kind.

The purpose of the generalized standard error methods is to compute the standard error of an entire experiment in order to increase the accuracy of the estimate of error.
In a variety test where each variety or treatment is replicated, say four times, the reliability of the results would be very low were one to compute the standard error for each variety separately. However, the estimate of error would be much more reliable when computed on 10 different varieties, each replicated say four times, in the same experiment. In this case a total of 40 plots would contribute to the estimate of error instead of four.

The analysis of variance, developed by R. A. Fisher, has proved to be the most precise, flexible, and readily usable method available for the analysis of the results from field and many other biological experiments. It consists essentially in the partition and apportionment of the total variation to its known causes with a residual portion ascribable to unknown or uncontrolled variation and therefore called experimental error. When the variability is measured in suitable terms, i.e., sums of squares of deviations about the means, the variability ascribed to the various causes will be strictly additive. The calculations are therefore extremely simple. The mean value of the sums of squares (mean square, variance, or standard error squared) is found by division of the sums of squares by the appropriate number of degrees of freedom.

The literature on the analysis of variance has become very extensive during the past 15 years. It was first set forth in its complete form by Fisher and MacKenzie in 1923. For its application to field experiments, Fisher (1934), and Fisher and Wishart (1930) have given excellent discussions. Among other sources of information on the application of the analysis of variance to field experiments may be mentioned the books by Snedecor (1934 and 1937), and Tippett (1937), and papers by Eden and Fisher (1929), Goulden (1931), Immer, et al (1934), and Wishart (1931). For a summary of the mathematical theorems involved in the analysis of variance, the work of Irwin (1931) is recommended.
A -- One Criterion of Classification

II. Theory of First Special Case

Suppose a sample is formed from the general population with random samples of equal size taken from each of the sub-populations. In case m sub-samples contain n measurements each, the total sample will contain N = nm measurements. It is now proposed to analyze the total variance, i.e.,

$$s^2 = \frac{S(x - \bar{x})^2}{N - 1} \qquad (1)$$

where $\bar{x}$ is the mean of the total sample. Let x represent an individual measure of any (i-th) sub-sample. Then,

$$x - \bar{x} = (x - \bar{x}_i) + (\bar{x}_i - \bar{x}) \qquad (2)$$

where $\bar{x}_i$ is the mean of the i-th sample. First, the above identity should be squared and summed for all the n individuals which form the i-th sub-sample. The symbol S' will be used for this summation.

$$S'(x - \bar{x})^2 = S'(x - \bar{x}_i)^2 + 2(\bar{x}_i - \bar{x})\,S'(x - \bar{x}_i) + n(\bar{x}_i - \bar{x})^2 \qquad (3)$$

Since $S'(x - \bar{x}_i) = 0$, it is evident that the second term on the right vanishes. Now suppose one sums over all the m different sub-groups by use of the symbol $S_{1}^{m}$. The combination $S_{1}^{m}S'$ is simply S, or summation for all individuals of the total sample.

$$S(x - \bar{x})^2 = S_{1}^{m}S'(x - \bar{x})^2 = S_{1}^{m}S'(x - \bar{x}_i)^2 + n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2 \qquad (4)$$

The term on the left is the sum of the squares of the deviations of the individual observations from the mean of the total sample. The first term on the right is the sum of the squares of the deviations of the individual observations from the means of the sub-samples. The second term on the right is n times the sum of squares of the deviations of the means of the sub-samples from the mean of the total sample.

The computation of these three terms is most easily accomplished as follows:

(1) Compute the term on the left,

$$S(x - \bar{x})^2 = Sx^2 - \frac{(Sx)^2}{N} \qquad (5)$$

(2) Next, the second term on the right,

$$n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2 = n\,S_{1}^{m}\bar{x}_i^2 - \frac{(Sx)^2}{N} = \frac{S(x_a^2)}{n} - \frac{(Sx)^2}{N} \qquad (6)$$

where $x_a$ is an abbreviation for S'(x), the total of the variates in a single sub-sample.

(3) The other term may be found by mere subtraction.
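The three computing steps above can be sketched as follows. This is a minimal illustration, not part of the manual: the function name and the small data set are hypothetical, and the last lines check the subtraction in step (3) against a direct computation of the within-sub-sample sum of squares.

```python
def partition_sums_of_squares(groups):
    """Split the total sum of squares for m sub-samples of n values each."""
    m = len(groups)
    n = len(groups[0])
    total_count = m * n
    values = [x for group in groups for x in group]
    correction = sum(values) ** 2 / total_count                     # (Sx)^2 / N
    total_ss = sum(x * x for x in values) - correction              # step (1)
    between_ss = sum(sum(g) ** 2 for g in groups) / n - correction  # step (2)
    within_ss = total_ss - between_ss                               # step (3)
    return total_ss, between_ss, within_ss

# Hypothetical data: m = 3 sub-samples of n = 3 measurements each.
groups = [[4, 6, 8], [1, 3, 5], [7, 8, 9]]
total_ss, between_ss, within_ss = partition_sums_of_squares(groups)

# Direct check: squared deviations of each value from its own sub-sample mean.
direct_within = sum(
    (x - sum(g) / len(g)) ** 2 for g in groups for x in g
)
print(total_ss, between_ss, within_ss, direct_within)
```

For these values the total of 56 splits into 38 between sub-samples and 18 within them, and the direct computation returns the same 18.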
The difficult thing to explain comes at this point. It would be easy to merely apportion, for the sample in question, the total variance into the variance within sub-samples and into that between sub-samples. Those two respective variances could be obtained by division of the first and second terms on the right of the identity by N. However, the real desire is to obtain the best estimate of the variance of the population as exhibited within the sub-populations on the one hand, and between the sub-populations on the other.

The best estimate of the total variance of the general population is given by:

$$s^2 = \frac{S(x - \bar{x})^2}{N - 1} \qquad (7)$$

where N - 1 (or nm - 1) is the number of degrees of freedom. Likewise, the best estimate of the variance within sub-populations (replicates agronomically) will be: 1/

$$s_r^2 = \frac{S_{1}^{m}S'(x - \bar{x}_i)^2}{N - m} \qquad (8)$$

1/ The relation is read "is estimated by" in this sense.

where N - m = m(n - 1) is the number of degrees of freedom. This is true because m separate means of the m sub-samples were used in the computation. This expression, $s_r^2$ = the variance within sub-samples, is often called the residual variance.

The last term, $n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2$ (Equation No. 4 above), must now be considered. At first glance, it would seem that the sum of squares of the deviations of the means of the sub-samples from the mean of the total sample when divided by m - 1, the number of degrees of freedom, would give a proper estimate of the variance between sub-populations. However, $\bar{x}_i$ does not represent the mean of the i-th sub-population, but rather the mean of the i-th sub-sample. Therefore, the difference, $\bar{x}_i - \bar{x}$, is due to the combination of (1) the inherent nature of the i-th sub-population and (2) sampling fluctuations within the i-th sub-sample. Thus, $n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2/(m - 1)$ can be interpreted to estimate the sum of $n s_t^2$, n times the variance of the means of the sub-populations, and $s_r^2$, the variance within sub-populations.
It should be remarked that the expression, $s_r^2 = S_{1}^{m}S'(x - \bar{x}_i)^2/(N - m)$, called the "variance within sub-samples", is simply one estimate of the variance of the total population.

The term, $n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2/(m - 1) = n s_t^2 + s_r^2$, called the "variance between sub-samples", is n times the variance of the sub-sample (treatment) means about the total sample mean with $s_r^2$ added. On the assumption (null hypothesis) that the true variance of the sub-population means about the total population mean is zero, it then becomes clear that "variance within sub-samples" and "variance between sub-samples" are both independent estimates of the same concept, i.e., variance of the total population.

The material may be placed in tabular form for clarity:

Source of Variation     Sums of Squares                          Degrees of Freedom   Estimated Mean Variances
Between sub-samples     $n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2$    m - 1                $n s_t^2 + s_r^2 = V_t^2$
Within sub-samples      $S_{1}^{m}S'(x - \bar{x}_i)^2$           N - m                $s_r^2 = V_r^2$
Total                   $S(x - \bar{x})^2$                       N - 1                $s^2 = V^2$

The first two entries in the last column may be examined, i.e., the estimates given by the sample. These are $n s_t^2 + s_r^2$ and $s_r^2$. It is obvious that the first should exceed the second unless $s_t^2$ is zero. 2/

It is now desired to determine whether or not there is a significant variation between the sub-populations, i.e., whether $\sigma_t^2$ is significantly different from zero. An estimate of $\sigma_t^2$ may be made by subtraction of the estimate of $\sigma_r^2$ from that of $n\sigma_t^2 + \sigma_r^2$. This result is then divided by n.

2/ Occasionally the reverse is true. This apparent contradiction is explained by the fact that the results are merely estimates which may be distorted to whatever extent sampling fluctuations may account.

So far it is obvious that the first step in the analysis of variance serves two purposes: (1) It gives a method to test the homogeneity of a population; (2) It gives a convenient method to test the differences between several means as a whole.
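The estimate of the variance between sub-population means, obtained by subtraction and division by n, can be sketched in code. The function name and the input values are hypothetical and serve only to show the arithmetic; a negative result is returned unchanged, since the text notes that sampling fluctuations can occasionally reverse the order of the two mean squares.

```python
def between_component(between_ss, within_ss, m, n):
    """Estimate the variance of sub-population means from the two sums of squares.

    m is the number of sub-samples and n the number of values in each.
    """
    ms_between = between_ss / (m - 1)      # estimates n * var(means) + var(within)
    ms_within = within_ss / (m * n - m)    # estimates var(within)
    return ms_between, ms_within, (ms_between - ms_within) / n

# Hypothetical sums of squares for m = 3 sub-samples of n = 3 values each.
ms_between, ms_within, var_between_means = between_component(38.0, 18.0, 3, 3)
print(ms_between, ms_within, var_between_means)
```

Here the mean squares are 19 and 3, so the estimated variance of the sub-population means is 16/3.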
Probably the best method to test for association between sub-populations is through the use of the "z" index devised by Fisher (1934). This affords a test of significance between two variances, e.g., $s_1^2$ and $s_2^2$.

$$z = \tfrac{1}{2}\log_e \frac{s_1^2}{s_2^2} \qquad (9)$$

The "z" table devised by Fisher (1934) may be used to test these values for significance through use of the number of degrees of freedom pertinent to each computed variance. In this case, $n\sigma_t^2 + \sigma_r^2$ takes the place of $s_1^2$, while $\sigma_r^2$ takes the place of $s_2^2$.

Tests of significance may also be made by means of the "F" test derived by Mahalanobis (1932) and by Snedecor (1934) (1937). The table by Snedecor is the more extensive. The value "F" is the quotient obtained by division of the larger by the smaller variance. The "F" and "z" tests are equivalent since $z = \tfrac{1}{2}\log_e F$.

III. Computation for Single Criterion of Classification

This case may be illustrated with some data for the yields of two barley varieties (See Chapter o). The yields in bushels per acre for the Glabron and Velvet varieties grown in single plots on 12 Minnesota farms were as follows: (Data from F. R. Immer)

Farm No. and Glabron ($x_1$): 1 IlO, 2 kl 3 39 11 37 5 k6 52 7 51 8 37 Q k? 10 k? 11 '4-6 12 6k
Velvet ($x_2$): kl -7 r *- 30 32 kl Vi )+o 36 k2 39 k'( 30
Total: 91 0i|. i / 60 87 93 Q6 113 37 Sk 93 10 : ;

               Glabron     Velvet      Total
Totals (Sx)    580         509         1089
Means          48.3333     42.4167

$S(x^2)$ = 50,599.00; $(Sx)^2/N$ = 49,413.38

Suppose that the total variability is separated into two components, viz., that "due to varieties" and that due to variation between plots of the same variety. The expression "due to varieties" simply means that it is proposed to make varieties the criterion for the break-down of the total sample into sub-samples.

The sum of squares for total variation is found by summation of the squares of the 24 plot yields, e.g. $(49)^2 + (47)^2 + \ldots + (39)^2 = 50{,}599$,
and the subtraction of the correction factor $(Sx)^2/N$ from this value. Algebraically, this is given by:

Total = $S(x - \bar{x})^2 = S(x^2) - (Sx)^2/N$ 1/ = 50,599.00 - 49,413.38 = 1185.62

The sum of squares for varieties is obtained by the summation of the squares for the two totals for varieties, dividing by number of plots or values contained in each variety total, and subtracting the correction factor, e.g.

Between Varieties = $n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2 = \dfrac{S(x_v^2)}{n} - \dfrac{(Sx)^2}{N} = \dfrac{595{,}481.00}{12} - 49{,}413.38 = 49{,}623.42 - 49{,}413.38 = 210.04$

The sum of squares for within varieties, here used as error, is the remainder after subtraction of the sums of squares for varieties from the total, e.g. 1185.62 - 210.04 = 975.58.

The analysis of variance follows:

Variation due to      D.F.   Sum Squares   Mean Square   Standard Error (s)   z obtained   z, 5 pct. point   F
Between Varieties       1      210.04       210.0400                           0.7777       0.7294            4.737
Within Varieties       22      975.58        44.3446       6.6592
Total                  23     1185.62

The degrees of freedom for "between varieties" and "total" are one less than the number of varieties and total number of plots, respectively. The degrees of freedom for within varieties are those for a single variety (11 in this case) multiplied by the number of varieties (2 in this case), or (2)(11) = 22.

The mean squares are obtained by division of the sums of squares by their respective degrees of freedom. The standard error of a single determination (s) is the square root of the mean square for error (or variance).

The z-test may be used to test the significance for variance "between varieties" and that "within varieties". The value, z, is one-half the difference between the natural logarithms of the two variances to be compared. The values of the logarithms needed in computing z are found in a table of natural logarithms (See Table 4, appendix).
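The analysis of variance table above can be reproduced from the quoted totals alone. This is a minimal sketch rather than the authors' procedure: the variable names are illustrative, and the variety totals of 580 and 509 bushels are taken as those implied by the quoted variety means (48.3333 and 42.4167 on 12 plots each), which agree with the quoted sum of squared totals, 595,481.

```python
import math

n_plots = 24            # total number of plots, N
per_variety = 12        # plots in each variety total, n
sum_x = 1089.0          # Sx, grand total of all plot yields
sum_x_squared = 50599.0 # S(x^2), sum of squared plot yields
variety_totals = [580.0, 509.0]  # Glabron, Velvet (inferred from the means)

correction = sum_x ** 2 / n_plots                       # (Sx)^2 / N
total_ss = sum_x_squared - correction
between_ss = sum(t * t for t in variety_totals) / per_variety - correction
within_ss = total_ss - between_ss                       # found by subtraction

df_between = len(variety_totals) - 1
df_within = n_plots - len(variety_totals)
ms_between = between_ss / df_between
ms_within = within_ss / df_within

std_error = math.sqrt(ms_within)      # standard error of a single determination
f_value = ms_between / ms_within      # larger variance over smaller
z_value = 0.5 * math.log(f_value)     # z = 1/2 log_e F

print(round(total_ss, 2), round(between_ss, 2), round(within_ss, 2))
print(round(ms_within, 4), round(std_error, 4), round(f_value, 3), round(z_value, 4))
```

Computing z as half the natural logarithm of F gives the same value as half the difference of the two logarithms, which is the form used in the hand computation.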
$$z = \tfrac{1}{2}\log_e 210.04 - \tfrac{1}{2}\log_e 44.3446 = \tfrac{1}{2}\log_e \left(\frac{210.0400}{44.3446}\right) = \tfrac{1}{2}\log_e 4.737 = 0.7777 \quad 2/$$

1/ $S(x - \bar{x})^2 = Sx^2 - 2\bar{x}\,S(x) + N\bar{x}^2 = Sx^2 - 2\bar{x}\cdot N\bar{x} + N\bar{x}^2 = Sx^2 - N\bar{x}^2$. Since $N\bar{x} = S(x)$, $Sx^2 - N\bar{x}^2 = Sx^2 - S(x)\,\bar{x} = Sx^2 - (Sx)^2/N$.

2/ The decimal point may be moved to the left on the mean square values to shorten the work, so long as the resultant numbers are greater than 1.0. The true $\log_e$ values will not be obtained but the difference or z-value is unaffected. A shift of decimals is particularly desirable when any of the mean squares are less than 1.0 to avoid taking a negative logarithm.

The theoretical z-value is looked up in the table given by Fisher (1934), where $N_1$ is the number of degrees of freedom for the larger and $N_2$ the degrees of freedom for the smaller variance. In this case z = 0.7294 for the theoretical value for the 5 per cent point.

However, the interpretation is made more easily by using the "F" value. The "F" value is the quotient of the larger by the smaller variance, e.g., F = 210.04/44.3446 = 4.74. In Snedecor's table (Table 2, Appendix) for $N_1$ = 1 D.F. and $N_2$ = 22 D.F., it is found that the observed "F" lies between the 5.0 per cent and 1.0 per cent points. The theoretical value for the 5.0 per cent point is 4.30. It may be noted that $F = t^2$ for one degree of freedom.

IV. The More General Case

Suppose that the number of observations in each sub-sample varies, and that they are represented by $n_1, n_2, \ldots, n_m$. Then $N = S n_i$. The equation for the sample is as follows:

$$S(x - \bar{x})^2 = S_{1}^{m}S'(x - \bar{x})^2 = S_{1}^{m}S'(x - \bar{x}_i)^2 + S_{1}^{m}n_i(\bar{x}_i - \bar{x})^2 \qquad (10)$$

Again, $S_{1}^{m}S'(x - \bar{x}_i)^2$ divided by N - m, the degrees of freedom, will give the variance within a group. However, it is now impossible to arrive at an estimate of $\sigma_t^2$ because $S_{1}^{m}n_i(\bar{x}_i - \bar{x})^2$ is affected by the different number of observations in each sub-sample. Therefore suppose that $\sigma_t^2$ is truly zero so that $S_{1}^{m}n_i(\bar{x}_i - \bar{x})^2/(m - 1)$ will estimate $\sigma_r^2$. This assumption may be tested for the existence of an association between sub-populations. To do this, the values $S_{1}^{m}S'(x - \bar{x}_i)^2/(N - m)$ and $S_{1}^{m}n_i(\bar{x}_i - \bar{x})^2/(m - 1)$ are compared for a significant difference.

In the field of agronomic experimentation this situation is rarely found because the experiments are designed to permit a simpler set-up for the computation of the statistical constants.

B -- Two or More Criteria of Classification

V. Theory of the Extended Case of the Analysis of Variance

Frequently, the complexity of the experiment that affords the data makes it necessary to analyze the total variance into more than two parts in order to make the most of the possibilities. First, re-examine the tabular arrangement for the first special case (Paragraph II). Suppose the classification of the total population into sub-populations, which forms the basis of the above analysis, be termed classification "A". Now suppose the total population lends itself to an independent classification, "B", which contains m' classes. For simplicity, assume that the sample sub-divides evenly for this classification with n' observations in each class. Thus, N = nm = n'm'.

Previously, the heterogeneity in the total population for classification "A" was tested by a comparison of $V_A^2$ with $V_R^2$. It was necessary to tacitly assume that each sub-population for classification "A" was homogeneous. Now, if the population submits to a new classification "B", it is quite likely that the original sub-populations were not homogeneous if classification "B" has any logical basis. Lack of homogeneity in the sub-populations increases the variance therein. The residual variance, $V_R^2$, may be so affected in the comparison between $V_A^2$ and $V_R^2$ that the differences between groups for classification "A" may appear to be insignificant when the opposite is true.
Therefore, it is proposed to remove from the squared residual errors, $S_{1}^{m}S'(x - \bar{x}_i)^2$, the sum of the squared errors between groups for classification "B". This amount will be termed $n'\,S_{j=1}^{m'}(\bar{x}_j - \bar{x})^2$. It represents m' - 1 degrees of freedom, while $\bar{x}_j$ indicates the mean of the j-th class of classification "B". The mean variance between groups for classification "B" will be designated as $V_B^2$.

At first, one might expect the mean residual variance ($V_R^2$) to be definitely reduced in this manner, regardless of any justification for classification "B". This is not true because the reduced sum of the residual squared errors now represents only N - m - m' + 1 degrees of freedom where N - m degrees of freedom were represented before. Thus, it is apparent that the new $V_R^2$ will not differ sensibly from its former value, should the differences between groups be insignificant for classification "B". However, the greater the significance of the differences between the groups for classification "B", the more markedly $V_R^2$ will be reduced. Then the ratio $V_A^2/V_R^2$ will be sensibly increased, with the result that the test for significance of differences between groups for classification "A" is strengthened.

The new tabular arrangement of the analysis is as follows:

Source of Variation     Sum of Squares                              Degrees of Freedom   Mean Variance
Between groups (A)      $n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2$       m - 1                $V_A^2$
Between groups (B)      $n'\,S_{1}^{m'}(\bar{x}_j - \bar{x})^2$     m' - 1               $V_B^2$
Residual                                                            N - m - m' + 1       $V_R^2$ (or $s^2$)
Total                   $S(x - \bar{x})^2$                          N - 1

The entry for the sum of squares of the residual errors is left blank because, in computation, it would be found by subtraction.

This process may be extended in the same manner to take into account other possible classifications which might contribute to the heterogeneous character of the original population. The object is for the residual variance to represent variance due to chance alone as nearly as possible.
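The two-way partition tabulated above can be sketched in code for the simplest layout, one observation for each combination of an "A" class and a "B" class. The function name and the small table of values are hypothetical; the comparison at the end shows how removing the "B" sum of squares from the residual can strengthen the test for "A".

```python
def two_way_partition(table):
    """Partition the total sum of squares for table[i][j].

    Row i is a class of "A", column j a class of "B", one value per cell.
    """
    m = len(table)               # number of "A" classes
    m_prime = len(table[0])      # number of "B" classes
    total_count = m * m_prime
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / total_count
    total_ss = sum(x * x for row in table for x in row) - correction
    a_ss = sum(sum(row) ** 2 for row in table) / m_prime - correction
    column_totals = [sum(row[j] for row in table) for j in range(m_prime)]
    b_ss = sum(t * t for t in column_totals) / m - correction
    residual_ss = total_ss - a_ss - b_ss          # found by subtraction
    residual_df = total_count - m - m_prime + 1
    return total_ss, a_ss, b_ss, residual_ss, residual_df

# Hypothetical yields: 2 classes of "A" (rows) by 3 classes of "B" (columns).
table = [[10, 12, 14],
         [13, 15, 20]]
total_ss, a_ss, b_ss, residual_ss, residual_df = two_way_partition(table)

m = len(table)
f_two_way = (a_ss / (m - 1)) / (residual_ss / residual_df)
# One criterion only: "B" is left in the residual, with N - m degrees of freedom.
f_one_way = (a_ss / (m - 1)) / ((total_ss - a_ss) / (m * len(table[0]) - m))
print(total_ss, a_ss, b_ss, residual_ss, residual_df, f_two_way, f_one_way)
```

In this example the variance ratio for "A" rises from about 2.8 to 16 once the variation between the "B" classes is taken out of the residual.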
Furthermore, an increase in the scope of an experiment will proportionately increase, to within differences due to sampling fluctuations, all the sums of squared deviations incorporated into the analysis. Hence, $V_A^2$ and $V_B^2$ will be increased proportionately since the number of degrees of freedom they respectively represent are unchanged. The value $V_R^2$ will be increased to a lesser extent due to the fact that the number of degrees of freedom represented will be more than proportionately increased. Thus, $V_A^2/V_R^2$ and $V_B^2/V_R^2$ will be increased which, together with the fact that a smaller z value is required to prove significance, make it more likely that positive conclusions can be drawn from the analysis of variance.

VI. Computation for Two Criteria of Classification

The same data on the yields of Glabron and Velvet barley varieties are used to illustrate this case. It is desired to determine whether or not there is a significant variation from farm to farm as well as between varieties. Hence, the computations will be for total variance, that due to farms, and that due to varieties. The residual variance will be obtained by subtraction.

$$S(x - \bar{x})^2 = S(x^2) - \frac{(Sx)^2}{N} = 50{,}599.00 - 49{,}413.38 = 1185.62$$

$$n\,S_{1}^{m}(\bar{x}_i - \bar{x})^2 = \frac{S(x_v^2)}{n} - \frac{(Sx)^2}{N} = \frac{595{,}481.00}{12} - 49{,}413.38 = 210.04$$

$$n'\,S_{1}^{m'}(\bar{x}_j - \bar{x})^2 = \frac{S(x_f^2)}{n'} - \frac{(Sx)^2}{N} = \frac{100{,}269.00}{2} - 49{,}413.38 = 721.12$$

The subscripts v and f indicate "varieties" and "farms". The new tabular arrangement now becomes:

Variation due to   D.F.   Sums Squares   Mean Square   Standard Error (s)   F-value
Farms               11      721.12
Varieties            1      210.04        210.0400                           9.08
Error               11      254.46         23.1327       4.8096
Total               23     1185.62

When the F-table is consulted it is found that an F-value of 4.84 is required for the 5 per cent point. Thus, the added refinement through the removal of the variation between farms greatly increased the significance of the difference due to varieties.

VII. Introduction to Analysis of Variance in Agricultural Experiments

The principal difficulty to contend with in field experiments is the variation in soil fertility over the area used in experimentation. The natural fertility usually varies continuously. The art of planning an experiment lies in the arrangement of the varieties, treatments or conditions under investigation in nearby plots. They are usually placed within as small a land area as is practically feasible. The entire arrangement is then replicated over a larger area so that the variations caused by regional changes in fertility may be removed from the comparison. The randomized block arrangement, and its more restricted form, the Latin square arrangement, are commonly used to make possible the removal of the general effect of soil heterogeneity by means of the analysis of variance (See Chapter on "Design of Simple Field Experiments").

In the use of the analysis of variance in field experiments it is assumed that the distribution of the plot yields is normal, i.e., that it fits the normal curve. The agronomist is familiar with the fact that the variability between plots of the same variety grown on land of high fertility is often less than between similar plots of low fertility. The variability among plots of high fertility may be considered as restricted by what may be termed "ceiling effect", which imparts an abnormal distribution to the population. Fisher and others (1932) found evidence of negative skewness in heights of barley plants selected at random from plots that received various nitrogen treatments. Eden and Yates (1933) obtained similar results with height measurements of wheat plants. They made a practical test on these data to determine whether the validity of the z-test would be destroyed by such non-normal data. They concluded that the z-test could be safely applied.

References

1. Eden, T. and Fisher, R. A. Studies in crop variation VI.
Experiments on the Response of the Potato to Potash and Nitrogen. Jour. Agr. Sci. 19:201-213.
2. Eden, T. and Yates, F. On the validity of Fisher's Z test when applied to an actual example of non-normal data. Jour. Agr. Sci. 23:6-16. 1933.
3. Fisher, R. A. Statistical methods for research workers. Oliver and Boyd, Edinburgh, Ed. 5. pp. 199-231. 1934.
4. Fisher, R. A., Immer, F. R., and Tedin, Olof. The genetical interpretation of statistics of the third degree in the study of quantitative inheritance. Genetics 17:107-124. 1932.
5. Fisher, R. A. and MacKenzie, W. A. Studies in crop variation. Jour. Agr. Sci. 13:311-320. 1923.
6. Fisher, R. A. and Wishart, J. The arrangement of field experiments and the statistical reduction of the results. Imperial Bureau of Soil Science, Tech. Comm. No. 10. 1930.
7. Goulden, C. H. Modern methods of field experimentation. Sci. Agr. 11:681-701. 1931.
8. Immer, F. R., Hayes, H. K., and Powers, LeRoy. Statistical determination of barley varietal adaptation. Jour. Am. Soc. Agron. 26:403-419. 1934.
9. Irwin, J. O. Mathematical theorems involved in the analysis of variance. Jour. Royal Stat. Soc. 94:284-300. 1931.
10. Mahalanobis, P. C. Auxiliary tables for Fisher's Z test in analysis of variance. Indian Jour. Agr. Sci. 2:679-693. 1932.
11. Snedecor, George W. Calculation and interpretation of analysis of variance and covariance. Collegiate Press, Inc., Ames, Iowa. 1934.
12. Snedecor, G. W. Statistical Methods. Collegiate Press, Ames. pp. 171-218. 1937.
13. Tippett, L. H. C. The Methods of Statistics. Williams and Norgate, London. (2nd edition), pp. 125-139. 1937.
14. Wishart, John. The analysis of variance illustrated in its application to a complex agricultural experiment on sugar beets. Wissenschaftliches Archiv für Landwirtschaft, 5:561-584. 1931.

Questions for Discussion

1. What is meant by generalized standard error methods? Why are they useful in agronomic experiments?
2.
What are the general features of the analysis of variance?
3. What is the basis of sub-division of the sample for one criterion of classification?
4. Why is it logical to use the variance for within varieties to compare with that between varieties?
5. What is the "z" test for significance? "F" test?
6. How may the sub-division of the total sample into two criteria of classification strengthen the experiment?
7. What assumptions are made in the use of the analysis of variance for plot yield data?

Problems

1. Yield data in bushels per acre for 5 wheat varieties are given below:

                Replications
Variety      1        2        3       Total
A          32.4     34.3     37.3     104.0
B          20.2     27.5     25.9      73.6
C          29.2     27.8     30.2      87.2
D          12.8     12.3     14.8      39.9
E          21.7     24.5     23.4      69.6
Totals    116.3    126.4    131.6     374.3

(a) Calculate the analysis of variance for one criterion of classification, i.e., between and within varieties.
(b) Obtain the "F" value and determine whether or not the varieties differ significantly in yield.
(c) Use the "z" test to determine significance.

2. Calculate the data in problem 1 for 2 criteria of classification, i.e., replicates and varieties. Determine whether or not the varieties differ significantly in yield by use of the "F" test.

CHAPTER XI

COVARIANCE WITH SPECIAL REFERENCE TO REGRESSION

I. Relationship of Covariance, Correlation, and Regression

The concepts of covariance, correlation, and regression are interwoven, being fundamentally equivalent. Suppose one considers N pairs of measures that relate to two characters represented by the variables x and y. In the chapter on correlation, it was seen that the basis for the measurement of correlation and regression was the product sum, $S(x - \bar{x})(y - \bar{y})$. The entire subject can well be treated by the analysis of variance principle.

II. Analysis of Covariance

Suppose the sample of N pairs of measures is divided into m sub-samples that contain n pairs of variates each.
Let $\bar{x}_i$ and $\bar{y}_i$ represent the pair of means that correspond to the i-th sub-sample. Then, for any pair of variates in the i-th sub-sample this equation can be formed:

$$(x - \bar{x})(y - \bar{y}) = [(x - \bar{x}_i) + (\bar{x}_i - \bar{x})]\,[(y - \bar{y}_i) + (\bar{y}_i - \bar{y})] \tag{1}$$

By an analogous procedure to the first treatment of the analysis of variance, the right side of this expression may be expanded and summed for all the pairs of variates in the i-th sub-sample, viz.,

$$S'(x - \bar{x})(y - \bar{y}) = S'(x - \bar{x}_i)(y - \bar{y}_i) + n(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}).$$

It is noticed that the two middle terms of the expansion become zero for the summation. The summation is taken again to include all the sub-samples, viz.,

$$\overset{m}{\underset{1}{S}}\,S'(x - \bar{x})(y - \bar{y}) = S(x - \bar{x})(y - \bar{y}) = \overset{m}{\underset{1}{S}}\,S'(x - \bar{x}_i)(y - \bar{y}_i) + n\,\overset{m}{\underset{1}{S}}(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}) \tag{2}$$

The total covariance or correlation, $S(x - \bar{x})(y - \bar{y})$, may be most easily computed by this formula:

$$S(x - \bar{x})(y - \bar{y}) = Sxy - \frac{(Sx)(Sy)}{N} \tag{3}$$

The term $n\,\overset{m}{\underset{1}{S}}(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})$, which measures the covariance between sub-sample means, can be computed as follows:

$$n\,\overset{m}{\underset{1}{S}}(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}) = n\,\overset{m}{\underset{1}{S}}\,\bar{x}_i\bar{y}_i - \frac{(Sx)(Sy)}{N} = \frac{\overset{m}{\underset{1}{S}}\,x_a y_a}{n} - \frac{(Sx)(Sy)}{N} \tag{4}$$

where $x_a$ and $y_a$ are abbreviations for $S'x$ and $S'y$, the sums of the variates in a single sub-sample.

The term $\overset{m}{\underset{1}{S}}\,S'(x - \bar{x}_i)(y - \bar{y}_i)$, which measures the covariance within the sub-samples, can be found by subtraction. The computation is analogous to the ordinary case of analysis of variance. In fact, it should be incorporated with it for each variable separately. An illustrative example will make the computation and analysis clear.

III. Computation of Covariance¹

The data used to illustrate this problem involve height measurements of 5 plants from each of 13 inbred lines of sweet clover, together with a determination of the percentage of leaves (by weight) on each of these plants. The data are given in table 1 for height in inches (x) and per cent leaves (y) for each of the 65 plants.

Table 1.
Data on Height and Percent Leaves of 5 Plants from each of 13 Lines of Sweet Clover

                              Plant Number                                Total
Line      1           2           3           4           5          Height   Leaves
No.     x    y      x    y      x    y      x    y      x    y         Sx       Sy
      (In.) (%)   (In.) (%)   (In.) (%)   (In.) (%)   (In.) (%)
 1     63   45     66   38     59   40     62   39     69   40        319      202
 2     70    ?     77   37     64   39     53    ?     61   40        325      192
 3     54   37     51   50     56   49     61   35     56   49        278      220
 4     40   50     39   44     44   42     38   43     45   45        206      224
 5      ?   49     40    ?      ?   42     39   43     40   44        199      230
 6     44   48     50   50     54   44     54   42     56   44        258      228
 7     58   38     60   38     58   42     60   40     64   40        300      198
 8     54   42     52   48     46   40     52   48     46   47        250      225
 9     52   41     56   39     52   42     46   43     48   42        254      207
10     58   40     48   41     52   39     52   40     60   39        270      199
11     63   41     54   42      ?   37      ?   40      ?   40        292      200
12     48   50      ?    ?      ?    ?      ?   49      ?   53        220      259
13     43   47     31    ?     45    ?     41   43     44   46        204      235
Total                                                                 3375     2819

Since there was no replication of these lines the total variability will be divided into only two components: (1) between lines and (2) between plants within lines. Let the height of plants be designated as (x) and the per cent leaves be designated as (y). The sum of squares for total variation in height of plants will be:

$$S(x^2) - \frac{(Sx)^2}{N} = 180,831.0 - 175,240.4 = 5590.6$$

The sum of squares for variation between lines is calculated from the sums of five plants per line as follows:

$$\frac{S(x_a^2)}{5} - \frac{(Sx)^2}{N} = \frac{898,467}{5} - 175,240.4 = 4453.0$$

In like manner the total sum of squares for per cent leaves will be:

$$S(y^2) - \frac{(Sy)^2}{N} = 123,747 - 122,257.9 = 1489.1$$

The sum of squares for the 13 lines in per cent leaves will be:

$$\frac{S(y_a^2)}{5} - \frac{(Sy)^2}{N} = \frac{615,713}{5} - 122,257.9 = 884.7$$

¹This illustrative example is one prepared by Dr. F. R. Immer, with minor modifications.

The sum of products for total variation will be obtained by multiplication of each plant height by the per cent leaves on that plant. The results are then summed.
This will be:

$$S(xy) - \frac{(Sx)(Sy)}{N} = 144,697 - 146,371.2 = -1674.2$$

The sum of products for variation between lines is obtained by a similar process, viz.,

$$\frac{S(x_a y_a)}{5} - \frac{(Sx)(Sy)}{N} = \frac{724,014}{5} - 146,371.2 = -1568.4$$

The analysis of variance and covariance table can now be constructed as given in table 2.

Table 2. Analysis of Variance and Covariance

                                 Sum of Squares and Products          Mean Square
Variation due to:       D.F.      x²         xy         y²           x²         y²
Lines (Between)          12     4453.0    -1568.4      884.7       371.08**    73.72**
Within Lines (Error)     52     1137.6     -105.8      604.4        21.88      11.62
Total                    64     5590.6    -1674.2     1489.1

**Exceeds the 1 per cent point.

The sums of squares and sum of products for variation between plants within lines are obtained by subtraction. Differences between lines with regard to height of plants (x) and percentage of leaves (y) may now be tested separately for significance in the ordinary manner. It is noted that these lines were significantly different in both height of plants (x) and per cent leaves (y), the mean square for lines compared with error being greater than the 1 per cent point. However, there is no method to determine the significance of covariance itself (xy). That is determined by tests of significance performed on correlation or regression coefficients calculated from it. This problem will be considered next.

IV. Calculation of Correlation and Regression Coefficients

The coefficients of correlation can be calculated directly from the sums of squares and products, since

$$r = \frac{S(x - \bar{x})(y - \bar{y})}{\sqrt{S(x - \bar{x})^2}\,\sqrt{S(y - \bar{y})^2}} \tag{5}$$

By substitution of the sums of squares and products for variation between lines, given in table 2, one obtains:

$$r = \frac{-1568.4}{\sqrt{4453.0}\,\sqrt{884.7}} = -.790$$

The other correlation coefficients can be calculated in like manner from table 2, merely by substitution of the sums of squares and products found in the appropriate row in the table, for the source of variability to be considered.
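The substitution into equation (5) can be checked with a short script. The following is a minimal sketch in Python, not part of the original manual; the function name `correlation` and the dictionary `TABLE_2` are illustrative choices, and the figures are the sums of squares and products of table 2.

```python
import math

# Sums of squares and products from table 2, one entry per source of variation:
# (S(x - xbar)^2, S(x - xbar)(y - ybar), S(y - ybar)^2)
TABLE_2 = {
    "between lines": (4453.0, -1568.4, 884.7),
    "within lines": (1137.6, -105.8, 604.4),
    "total": (5590.6, -1674.2, 1489.1),
}


def correlation(ss_x, sp_xy, ss_y):
    """Equation (5): sum of products over the root of the product of sums of squares."""
    return sp_xy / math.sqrt(ss_x * ss_y)


for source, (ss_x, sp_xy, ss_y) in TABLE_2.items():
    print(f"{source:14s} r = {correlation(ss_x, sp_xy, ss_y):+.3f}")
```

Each row of table 2 is treated in the same way, so the same function serves for any source of variation in a more complex analysis.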
The coefficient of regression of y on x, i.e., for prediction of y from x, will be given by

$$b = \frac{S(x - \bar{x})(y - \bar{y})}{S(x - \bar{x})^2}$$

By substitution of the sum of products and sum of squares for "lines" from table 2,

$$b = \frac{-1568.4}{4453.0} = -.3522$$

The correlation and regression coefficients are given in table 3.

Table 3. Coefficients of Correlation and Regression

                                 Correlation between          Regression of
                                 height (x) and per           per cent leaves on
Variation due to:      D.F.      cent leaves (y)              height (y on x)
Lines (Between)         11           -.790**                      -.3522
Within Lines            51           -.128                        -.0930
Total                   63           -.580**                      -.2995

**Exceeds the 1 per cent point of Fisher's table V.A.

$$\sigma_z = \frac{1}{\sqrt{N - 3}}$$

From Fisher's table V.A. it is seen that r = .790 is greater than the expected value of r for P = .01. The chances are, therefore, in excess of 99:1 against the occurrence of so large a correlation coefficient through errors of random sampling from uncorrelated material. The degrees of freedom for Fisher's table V.A. are 2 less than the number of pairs in the sample and would, therefore, be one less than the degrees of freedom in table 2. The correlation coefficient within lines, r = -.128, is not significant. The degrees of freedom are 51 in this case.

V. Tests of Significance for Regression Coefficients

The regression coefficients can be tested for significance by means of an analysis of variance or by means of a "t" test. The former method will be illustrated first.

(a) Test by Analysis of Variance

Suppose there exists a linear regression of percentage of leaves (y) on plant height (x). Then $y_e$, the estimated percentage of leaves from a sample of N pairs of values of y and x, is given by the regression equation:

$$y_e = a + b(x - \bar{x}) \tag{6}$$

In this equation, $a = \bar{y}$ and $b = \dfrac{S(y - \bar{y})(x - \bar{x})}{S(x - \bar{x})^2}$ are estimates of the true mean percentage of leaves and the true regression coefficient, respectively.
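The estimates a and b in equation (6) can be obtained directly from the totals of table 1 and the sums of table 2. The sketch below is illustrative only and is written in Python; the names `regression_coefficient` and `estimate`, and the trial height of 60 inches, are assumptions introduced for the example.

```python
# S(x - xbar)^2 and S(x - xbar)(y - ybar) from table 2
SOURCES = {
    "between lines": (4453.0, -1568.4),
    "within lines": (1137.6, -105.8),
    "total": (5590.6, -1674.2),
}


def regression_coefficient(ss_x, sp_xy):
    """b = S(x - xbar)(y - ybar) / S(x - xbar)^2, the regression of y on x."""
    return sp_xy / ss_x


def estimate(x, mean_x, mean_y, b):
    """Equation (6) with a = ybar: y_e = ybar + b(x - xbar)."""
    return mean_y + b * (x - mean_x)


# General means from the grand totals of table 1 (65 plants)
mean_x = 3375 / 65
mean_y = 2819 / 65

for source, (ss_x, sp_xy) in SOURCES.items():
    print(f"{source:14s} b = {regression_coefficient(ss_x, sp_xy):+.4f}")

# Estimated per cent leaves for a hypothetical line 60 inches tall,
# using the regression between lines
b_lines = regression_coefficient(*SOURCES["between lines"])
print(round(estimate(60.0, mean_x, mean_y, b_lines), 2))
```

The trial height is arbitrary; any height within the range of the observed line means may be substituted.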
Since the regression equation can be written as $\bar{y} = y_e - b(x - \bar{x})$, it is apparent that:

$$S(y - \bar{y})^2 = S\{y - [y_e - b(x - \bar{x})]\}^2 = S(y - y_e)^2 + 2bS(y - y_e)(x - \bar{x}) + b^2S(x - \bar{x})^2$$

The middle term on the right is zero. To see this, consider $S(y - y_e)(x - \bar{x})$. Since $y_e = \bar{y} + b(x - \bar{x})$, we have:

$$S[y - \bar{y} - b(x - \bar{x})](x - \bar{x}) \quad \text{or} \quad S(y - \bar{y})(x - \bar{x}) - bS(x - \bar{x})^2.$$

It is clear that the whole expression is zero due to the fact that $b = \dfrac{S(y - \bar{y})(x - \bar{x})}{S(x - \bar{x})^2}$. Hence,

$$S(y - \bar{y})^2 = S(y - y_e)^2 + b^2S(x - \bar{x})^2 \tag{7}$$

Thus, the sum of the squares of the deviations of the percentage of leaves has been analyzed into two components, one dependent on b, and therefore ascribable to regression, and the other a sum of squares that represents deviation from regression or residual. Since $b = \dfrac{S(y - \bar{y})(x - \bar{x})}{S(x - \bar{x})^2}$, it is obvious that the value of $b^2S(x - \bar{x})^2$, the component ascribable to regression, will be:

$$b^2S(x - \bar{x})^2 = \frac{[S(x - \bar{x})(y - \bar{y})]^2}{S(x - \bar{x})^2} \tag{8}$$

This procedure is now applied to the illustrative problem. The values from table 2 will be used to compute total regression:

$$S(y - \bar{y})^2 = 1489.10$$

$$\frac{[S(x - \bar{x})(y - \bar{y})]^2}{S(x - \bar{x})^2} = \frac{(-1674.2)^2}{5590.6} = 501.37$$

The analysis to test the significance of total regression follows in table 4.

Table 4. Analysis of Variance to Test the Significance of Total Regression

Variation                     D.F.    Sum of Squares    Mean Square    F-value
Due to Regression               1         501.37           501.37      31.98**
Deviations from Regression     63         987.73            15.68
Total                          64        1489.10

The total sum of squares for y (leaf percentage) is taken directly from table 2. Here y is used as the dependent variable, i.e., y (leaf percentage) is predicted from x (plant height), which is known. The sum of squares due to deviations from regression is obtained by subtraction, i.e., 1489.10 - 501.37 = 987.73. There will be one degree of freedom due to linear regression with a remainder of N-2 degrees of freedom for deviations from regression.
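The partition in equations (7) and (8) amounts to a few lines of arithmetic. The following Python sketch, offered only as an illustration, reproduces the entries of table 4 from the "total" line of table 2; the function name `regression_anova` is an assumption made for the example.

```python
def regression_anova(ss_x, sp_xy, ss_y, n_pairs):
    """Split S(y - ybar)^2 into regression and residual parts, equations (7) and (8)."""
    ss_regression = sp_xy ** 2 / ss_x        # b^2 S(x - xbar)^2, one degree of freedom
    ss_residual = ss_y - ss_regression       # S(y - y_e)^2, found by subtraction
    df_residual = n_pairs - 2
    ms_residual = ss_residual / df_residual
    f_value = ss_regression / ms_residual    # regression mean square has 1 d.f.
    return ss_regression, ss_residual, df_residual, ms_residual, f_value


# "Total" line of table 2: 65 plants
ss_reg, ss_res, df_res, ms_res, f_value = regression_anova(5590.6, -1674.2, 1489.1, 65)
print(f"regression SS = {ss_reg:.2f}")
print(f"residual SS   = {ss_res:.2f} on {df_res} d.f.")
print(f"residual MS   = {ms_res:.2f}")
print(f"F             = {f_value:.2f}")
```

The F-value is then compared with the tabulated 1 per cent point for 1 and N-2 degrees of freedom.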
It is also to be noted that N-2 is the number of degrees of freedom used to test the significance of r (Fisher, Table V.A). It is obvious from table 4 that the regression coefficient is highly significant, since the "F" value exceeds the one per cent point. The same conclusion was obtained when r was tested for significance. In fact, the two tests for significance are equivalent. When the correlation coefficient is significant, the regression coefficient must be significant, and vice versa.

To test for the significance of regression between lines, the values already computed for that source of variation in table 2 are used:

$$S(y - \bar{y})^2 = 884.7$$

$$\frac{[S(x - \bar{x})(y - \bar{y})]^2}{S(x - \bar{x})^2} = \frac{(-1568.4)^2}{4453.0} = 552.4$$

The values are summarized in table 5.

Table 5. Analysis of Variance to Test the Significance of Regression Between Lines

Variation                     D.F.    Sum of Squares    Mean Square    F-value
Due to Regression               1         552.4            552.40      18.29**
Deviations from Regression     11         332.3             30.21
Total                          12         884.7

It is thus evident that the regression between lines is extremely significant. The regression within lines will not be tested for significance since r is not significant (See table 3).

(b) The "t" Test of Significance

Regression coefficients may be tested by means of the "t" test also (See Fisher, 1934, pp. 126-137). As an illustration, the significance of the regression of y on x between lines may be tested. From tables 2 and 3, b = -0.3522, $S(x - \bar{x})^2 = 4453.0$, and $S(y - \bar{y})^2 = 884.7$. Then,

$$t = \frac{b\sqrt{S(x - \bar{x})^2}}{s} \tag{9}$$

where

$$s^2 = \frac{S(y - \bar{y})^2 - b^2S(x - \bar{x})^2}{m - 2} \tag{10}$$

Then,

$$s^2 = \frac{884.7 - (-0.3522)^2(4453.0)}{13 - 2} = \frac{332.3}{11} = 30.21, \qquad s = 5.496$$

$$t = \frac{0.3522\sqrt{4453.0}}{5.496} = 4.276 \text{ for 11 D.F.}$$

From the "t" table it is obvious that the observed t-value exceeds the 1.0 per cent point. Since $\sqrt{F} = t$ for one degree of freedom, it is noted that $\sqrt{F} = \sqrt{18.29} = 4.277$ (from table 5).
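The agreement between t and the square root of F can be verified numerically. The sketch below, in Python, is illustrative only; the function name `regression_t` is an assumption, and the figures are those for variation between lines in table 2.

```python
import math


def regression_t(b, ss_x, ss_y, n_pairs):
    """Equations (9) and (10): t = b sqrt(S(x - xbar)^2) / s on n_pairs - 2 d.f."""
    s_squared = (ss_y - b ** 2 * ss_x) / (n_pairs - 2)
    return b * math.sqrt(ss_x) / math.sqrt(s_squared)


# Variation between lines (table 2); m = 13 line means
ss_x, sp_xy, ss_y, m = 4453.0, -1568.4, 884.7, 13
b = sp_xy / ss_x

t_value = regression_t(b, ss_x, ss_y, m)

# F from the analysis of variance of table 5, computed from the same sums
ss_regression = sp_xy ** 2 / ss_x
f_value = ss_regression / ((ss_y - ss_regression) / (m - 2))

# t carries the sign of b; the text quotes its absolute value
print(f"|t| = {abs(t_value):.3f}, sqrt(F) = {math.sqrt(f_value):.3f}")
```

Because both statistics are built from the same three sums, the square of t equals F exactly when the arithmetic is carried without rounding.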
Thus, it is apparent that tests of significance of regression coefficients by means of the analysis of variance and the "t" test are equivalent. Moreover, they give the same result as tests of significance of the correlation coefficient (Table V.A, Fisher, 1934).

VI. Substitution in Regression Equation

The regression equation is usually expressed as $y_e = \bar{y} + b(x - \bar{x})$. For such a regression between lines one may substitute $\bar{y} = 43.37$, $\bar{x} = 51.92$ and b = -.3522. The mean values of y and x are obtained directly from table 1 by division of the totals by 65. The value of b is taken from table 3. Numerically, $y_e = 43.37 - 0.3522(x - 51.92)$. This regression equation can be simplified to $y_e = 61.66 - 0.3522x$, where x is any value of plant height.

In table 6 are given the mean height of each line, the mean leaf percentage of each line, and the leaf percentage predicted from plant height by means of the equation above.

Table 6. Observed Mean Height, Mean Leaf Percentage and Predicted Leaf Percentage of the 13 Lines of Sweet Clover

Line    Observed mean    Observed mean      Predicted mean
No.     height (x)       % leaves (y)       % leaves (y_e)
 1         63.8              40.4               39.2
 2         65.0              38.4               38.8
 3         55.6              44.0               42.1
 4         41.2              44.8               47.1
 5         39.8              46.0               47.6
 6         51.6              45.6               43.5
 7         60.0              39.6               40.5
 8         50.0              45.0               44.0
 9         50.8              41.2               43.8
10         54.0              39.8               42.6
11         58.4              40.0               41.1
12         44.0              51.8               46.2
13         40.8              47.0               47.3

The differences between observed mean leaf percentage and predicted leaf percentage in table 6 represent errors in prediction. The sum of squares of these differences would be given by $S(y - y_e)^2$, where y represents the observed mean leaf percentage and $y_e$ the predicted value. This quantity can be computed from table 6 by subtraction of the observed and predicted leaf percentages, these values being squared and added to give $S(y - y_e)^2 = 66.54$.
Now this sum of squares is based on means of 5 plants per line while the analysis of variance in table 5 was on a single plant basis. Therefore, multiplication of 66.54 by 5 to place it on a single plant basis gives 332.7. This agrees with the sum of squares due to deviation from regression, i.e., 332.3, considering that the predicted leaf percentages have been computed to only one place of decimals. It may be noted also that $s^2$ used in the "t" test could be written

$$s^2 = \frac{S(y - y_e)^2}{m - 2},$$

since $S(y - y_e)^2 = S(y - \bar{y})^2 - b^2S(x - \bar{x})^2$, the latter form being simpler for computation purposes.

While the application of analysis of variance and covariance to correlation and regression problems has been illustrated here with data from a very simple experiment, it is evident that it is equally applicable to problems of any degree of complexity.¹ The analysis of variance and covariance are keyed out for the particular problem under investigation, after which the correlation and regression coefficients are calculated for any component of the total variability. The tests of significance are made in a manner similar to the ones illustrated.

VII. Use of Covariance

The analysis of covariance is often successfully applied in an artificial reduction of experimental error in certain types of experiments where preliminary or uniformity trial data are available. There may be factors which it is impossible to equalize satisfactorily between the different treatments, and yet there may be reason to suppose that greater accuracy would arise from their equalization, were that possible. Availability of preliminary data may provide the basis for such equalization.

The possible use of data from a previous uniformity trial to reduce errors due to soil heterogeneity in the experimental years has been given considerable attention in field trials in recent years. The assumption is that soil fertility is constant from year to year.
Thus, a significant correlation between the same plots in successive years may be used to reduce the error in the experimental year. The regression equation is applied to predict the yields in the experimental year from the yields of the same plots grown under uniform treatment in the previous year. The deviations from the predicted yields should then contribute to the error of the experiment. Methods to utilize information from previous crop records have been outlined by Fisher (1934), Sanders (1930), Eden (1931), and by Wishart and Sanders (1935). With annual crops, Summerby (1934) found that it was not worthwhile to sacrifice a year to a uniformity trial in order to obtain information to reduce the error in the experimental year. The method seems to have the greatest possibilities with perennial crops. (See Fisher, 1937).

Another possible application of covariance arises where stand counts are available in addition to yield. Mahoney and Baten (1939) have made such an application. Stand counts may furnish a good index of plot variability provided they have been unaffected by treatment. Correction for stand, which can be made from the regression relation, provides an adjustment of the data to what they would be if all plots had the same number of plants (proportionality assumed). When yield is related to plant number it is obvious that the experimental error will be decreased when this factor is taken into account and a correction made for it. It is first necessary to determine whether or not such a relationship exists.

¹For a consideration of curvilinear regression and its treatment by the analysis of variance, the reader is referred to more advanced works on the subject.

The simpler aspects of covariance as applied to between and within groups have already been considered. The method will be used here for ordinary field experiments where the total variation is sub-divided into more than two parts.
An illustrative example used by Fisher (1934) will be followed.

VIII. Use of Preliminary Trial Data for Error Reduction

Some data collected by Eden (1931)¹ on tea will be used to illustrate the calculations for covariance for preliminary and experimental yields. Four "dummy" treatments for yields of tea expressed in per cent of the mean in a randomized block experiment are given in table 7.

Table 7. Preliminary and Experimental Yields of Tea Plants

             Preliminary (x) or            Blocks              Treatment    Treatment
Treatment    Experimental (y)       1      2      3      4      Total        Mean
A                   x               91    118    109    102       420       105.00
                    y               85    121    114    107       427       106.75
B                   x               88     94    105     91       378        94.50
                    y               81     93    106     92       372        93.00
C                   x               88    110    115     96       409       102.25
                    y               90    106    111    102       409       102.25
D                   x              102    109     94     88       393        98.25
                    y               93    114     93     92       392        98.00
Block Total         x              369    431    423    377      1600
                    y              349    434    424    393      1600

The preliminary yields will be designated as x and the experimental yields as y in the subsequent calculations.

¹Cited by R. A. Fisher (1934).

(a) Analysis of Variance and Covariance for Preliminary and Experimental Yields

The sums of squares for preliminary yields:

Total: $S(x^2) - (Sx)^2/N = 1526.0$
Treatments: $S(x_t^2)/4 - (Sx)^2/N = 253.5$
Blocks: $S(x_b^2)/4 - (Sx)^2/N = 745.0$

Sums of squares for experimental yields:

Total: $S(y^2) - (Sy)^2/N = 2040.0$
Blocks: $S(y_b^2)/4 - (Sy)^2/N = 1095.5$
Treatments: $S(y_t^2)/4 - (Sy)^2/N = 414.5$

Sums of products:

Total: $S(xy) - (Sx)(Sy)/N = 1612.00$
Treatments: $S(x_t y_t)/4 - (Sx)(Sy)/N = 323.25$
Blocks: $S(x_b y_b)/4 - (Sx)(Sy)/N = 837.00$

The above results are incorporated in table 8.

Table 8. Analysis of Variance and Covariance

                           Sums of Squares     Sums of Products     Mean Squares         F-Value
Variation due to   D.F.     (x)       (y)           (xy)            (x)       (y)       (x)      (y)
Blocks               3     745.0    1095.5         837.00          248.33    365.17    4.24*    6.20*
Treatments           3     253.5     414.5         323.25           84.50    138.17    1.44     2.35
Error                9     527.5     530.0         451.75           58.61     58.89
Total               15    1526.0    2040.0        1612.00

From this analysis it is clear that the "dummy" treatments did not differ significantly in yield, while a considerable degree of soil heterogeneity evidently exists because the variation between blocks proved significant for both the preliminary and experimental data. It is now proposed to test the covariance as a basis to provide a correction for the mean experimental yields in an effort to reduce the soil heterogeneity effect further. The analysis of covariance is given in table 9.

Table 9. Analysis of Covariance and Test of Significance of Adjusted Experimental Means

                               Sum of       Sum of        Sum of           Errors of Estimate
                               Squares      Products      Squares      Sum of               Mean
Variation due to      D.F.       (x)          (xy)          (y)        Squares     D.F.     Square
Blocks                  3       745.0        837.00       1095.5
Treatments              3       253.5        323.25        414.5
Error                   9       527.5        451.75        530.0       143.12        8      17.89
Total                  15      1526.0       1612.00       2040.0
Treatments + Error     12       781.0        775.00        944.5       175.45       11      15.95
Test of significance for adjusted treatment means                       32.33        3      10.78

F = 17.89/10.78 = 1.66, non-significant.

Since the total has been broken down into more than two parts, it is necessary to form a new total which contains only the two effects under study, viz., treatment and error. This new total is in the line, treatments + error, in table 9. The degrees of freedom, sums of squares, and products are added to obtain the appropriate numbers.

The sums of squares for errors of estimate, $S(y - y_e)^2$, are calculated by use of the principle of subtraction, viz.,

$$\left[S(y^2) - \frac{(Sy)^2}{N}\right] - \frac{\left[S(xy) - (Sx)(Sy)/N\right]^2}{S(x^2) - (Sx)^2/N},$$

in the lines for error, and treatments + error. These computations are as follows:

(1) Error: $530.0 - (451.75)^2/527.5 = 143.12$
(2) Treatments + Error: $944.5 - (775)^2/781 = 175.45$

The sum of squares for error is subtracted from that for treatments + error to yield the sum of squares appropriate for the test of significance for the adjusted treatment means, viz., 175.45 - 143.12 = 32.33.

In this particular case, the mean square for adjusted treatment means is not significantly different from error since "dummy" treatments were used.

(b) Calculation of the Regression Coefficient

The regression coefficient (b) is calculated from the values in table 9. The regression required is the regression of y on x in the row designated error. Since the regression coefficient is the ratio of the sum of products to the sum of squares of the independent variable,

$$b = \frac{S(xy) - (Sx)(Sy)/N}{S(x^2) - (Sx)^2/N} = \frac{451.75}{527.50} = 0.8564$$

The significance of the error regression, b = 0.8564, should be tested at this point. Unless it is significant, there will be little advantage in using it to reduce the error for the experimental year. The sum of squares due to linear regression will be:

$$\frac{\left[S(xy) - (Sx)(Sy)/N\right]^2}{S(x^2) - (Sx)^2/N} = \frac{(451.75)^2}{527.50} = 386.88$$

The test for significance is summarized in table 10.

Table 10. Test of Significance for Error Regression

Variation                                                                     Sum of     Mean
due to             Formulas                                          D.F.    Squares    Square     F
Regression         [S(xy) - (Sx)(Sy)/N]² / [S(x²) - (Sx)²/N]           1      386.88    386.88    21.63
Deviations from    S(y²) - (Sy)²/N
  regression         - [S(xy) - (Sx)(Sy)/N]² / [S(x²) - (Sx)²/N]       8      143.12     17.89¹
Total for Error    S(y²) - (Sy)²/N                                     9      530.00     58.89

¹Error for adjusted yields.

The observed F-value is highly significant. It indicates that it will be worth while to proceed with the correction of the experimental test data on the basis of their regression on the preliminary yields.

(c) The Adjusted Treatment Means

The adjusted treatment means can be calculated and compared with the unadjusted. The formula for the adjusted values is $y - b(x - \bar{x})$, where y is the individual treatment mean in the experimental year. The computations are given in table 11.
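The covariance arithmetic for the tea data can be retraced from the plot yields of table 7. The following Python sketch is an illustration only, not the authors' procedure; the helper names `partition` and `errors_of_estimate` are assumptions introduced for the example.

```python
# Table 7 yields of tea, in per cent of the mean.
# Rows are treatments A-D; columns are blocks 1-4.
X = [[91, 118, 109, 102], [88, 94, 105, 91], [88, 110, 115, 96], [102, 109, 94, 88]]
Y = [[85, 121, 114, 107], [81, 93, 106, 92], [90, 106, 111, 102], [93, 114, 93, 92]]


def partition(u, v):
    """Sums of products of u and v by source; partition(u, u) gives sums of squares."""
    n_treat, n_block = len(u), len(u[0])
    correction = sum(map(sum, u)) * sum(map(sum, v)) / (n_treat * n_block)
    total = sum(a * b for ru, rv in zip(u, v) for a, b in zip(ru, rv)) - correction
    treatments = sum(sum(ru) * sum(rv) for ru, rv in zip(u, v)) / n_block - correction
    blocks = sum(sum(cu) * sum(cv) for cu, cv in zip(zip(*u), zip(*v))) / n_treat - correction
    return {"total": total, "treatments": treatments, "blocks": blocks,
            "error": total - treatments - blocks}


def errors_of_estimate(ss_x, sp_xy, ss_y):
    """S(y - y_e)^2 = S(y - ybar)^2 - [S(x - xbar)(y - ybar)]^2 / S(x - xbar)^2."""
    return ss_y - sp_xy ** 2 / ss_x


sxx, sxy, syy = partition(X, X), partition(X, Y), partition(Y, Y)

# Errors of estimate for the error line and for the treatments + error line (table 9)
error_ss = errors_of_estimate(sxx["error"], sxy["error"], syy["error"])
tr_err_ss = errors_of_estimate(sxx["treatments"] + sxx["error"],
                               sxy["treatments"] + sxy["error"],
                               syy["treatments"] + syy["error"])
adjusted_treatment_ms = (tr_err_ss - error_ss) / 3   # 3 d.f. for treatments
error_ms = error_ss / 8                              # 9 - 1 d.f. after fitting b

# Error regression and the adjusted treatment means y - b(x - xbar)
b = sxy["error"] / sxx["error"]
grand_mean_x = sum(map(sum, X)) / 16
adjusted = {name: sum(ry) / len(ry) - b * (sum(rx) / len(rx) - grand_mean_x)
            for name, rx, ry in zip("ABCD", X, Y)}

print(f"errors of estimate: error {error_ss:.2f}, treatments + error {tr_err_ss:.2f}")
print(f"mean squares: adjusted treatments {adjusted_treatment_ms:.2f}, error {error_ms:.2f}")
print(f"b = {b:.4f}")
for name, value in adjusted.items():
    print(f"treatment {name}: adjusted mean {value:.2f}")
```

Working from the individual plot yields rather than from the rounded table entries keeps the subtraction steps free of accumulated rounding.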
The mean yields per treatment of the original data, x and y, are taken directly from table 7.

Table 11. Calculation of Mean Yields of Treatments in Experimental Test Corrected for Yields in Preliminary Test

             Mean Yield       Deviations                    Mean Yield        Corrected Yields for
Treat-       Preliminary      from Mean      Product        Experimental      Experimental Test
ment         Test (x)         (x - x̄)        b(x - x̄)       Year (y)          y - b(x - x̄)
A             105.00            5.00           4.28           106.75              102.47
B              94.50           -5.50          -4.71            93.00               97.71
C             102.25            2.25           1.93           102.25              100.32
D              98.25           -1.75          -1.50            98.00               99.50
Gen. Mean     100.00            0.00           0.00           100.00              100.00

$$b = \frac{S(xy) - (Sx)(Sy)/N}{S(x^2) - (Sx)^2/N} = 0.8564$$

The regression equation for error for y on x is as follows:

$$y_e = \bar{y} + b(x - \bar{x}) = 100.0 + 0.8564(x - 100.00) = 0.8564x + 14.36$$

The graphical representation is shown below, the points for the determination of the line being found by substitution in the regression equation. Let x = 92.5; then $y_e = 93.58$. Let x = 107.5; then $y_e = 106.42$.

[Figure: regression line of yields in experimental year (y) on yields in preliminary test (x).]

(d) Standard Error of a Difference

The standard error of a given difference between the corrected mean yields is given by Wishart and Sanders (1935) as follows:

$$\sigma_d = \sqrt{\frac{2s^2}{n} + \frac{s^2(\bar{x}_1 - \bar{x}_2)^2}{A'}} \tag{11}$$

where $s^2$ = the variance of the corrected yields (17.89), n = the number of plots per treatment (4), A' = the sum of squares for error in the original preliminary trial (527.5), and $\bar{x}_1$ and $\bar{x}_2$ = the means of the preliminary treatment plots being compared (105.00 - 94.50 = 10.50).

For treatments A and B, the mean difference of the corrected yields (table 11) is 102.47 - 97.71 = 4.76. The standard error is computed as follows:

$$\sigma_d = \sqrt{\frac{2(17.89)}{4} + \frac{(17.89)(10.50)^2}{527.5}} = 3.56$$

$$d/\sigma_d = 4.76/3.56 = 1.34, \text{ a non-significant value.}$$

(e) Factors in Use of Independent Variable

The investigator usually wishes to know when it is worthwhile to introduce the independent variable into the experiment.
This question is answered by Snedecor (1937), who states that three items will aid him. First, the list of actual and adjusted means. Sometimes the rank order of adjusted means is quite different from that of the unadjusted, and the shifts may be interpreted. Second, a comparison of the sum of squares of errors of estimate (table 9) used to test treatment significance, 32.33, with the unadjusted treatment sum of squares for y, $S(y_t^2)/4 - (Sy)^2/N = 414.5$. The latter is far greater than the former. Third, the change in precision of the experiment due to the adjustment of the error sums of squares. This is indicated in table 10. The error sum of squares for y, 530.00 with 9 degrees of freedom, is analyzed into two parts, one with a single degree of freedom that measures the variation attributable to regression, the other 8 degrees of freedom being assigned to error. The mean square for error is reduced from 58.89 to 17.89, which is highly significant. These factors will enable the investigator to decide whether to retain the independent variable in similar experiments.

It has already been mentioned that the use of preliminary uniformity data to reduce the error in the subsequent experimental test may be useful in perennial crops, but probably is not worth while for annual crops.

References

1. Bartlett, M. S. A Note on the Analysis of Covariance. Jour. Agr. Sci., 26:488-491. 1936.
2. Cox, G. M., and Snedecor, G. W. Covariance used to Analyze the Relation between Corn Yields and Acreage. Jour. Farm Econ., 18:597-607. 1936.
3. Eden, T. Studies in the Yield of Tea. I: The Experimental Errors of Field Experiments with Tea. Jour. Agr. Sci., 21:547-573. 1931.
4. Fisher, R. A. Statistical Methods for Research Workers. Oliver and Boyd, 5th edition, pp. 257-272. 1934.
5. Fisher, R. A. The Design of Experiments. Oliver and Boyd, 2nd Ed., pp. 172-189. 1937.
6. Garner, F. H., Grantham, J., and Sanders, H. G. The Value of Covariance in Analyzing Field Experimental Data. Jour. Agr. Sci., 24:250-259. 1934.
7. Goulden, C.
H. Methods of Statistical Analysis. Burgess Publ. Co., pp. 151-157, and 185-196. 1937.
8. Immer, F. R. A Study of Sampling Technic with Sugar Beets. Jour. Agr. Res., 44:633-647. 1932.
9. Immer, F. R., and Raleigh, S. M. Further Studies of Size and Shape of Plot in Relation to Field Experiments with Sugar Beets. Jour. Agr. Res., 47:591-598. 1933.
10. Mahoney, C. H., and Baten, W. D. The Use of the Analysis of Covariance and its Limitation in the Adjustment of Yields Based upon Stand Irregularities. Jour. Agr. Res., 58:317-328. 1939.
11. Sanders, H. G. A Note on the Value of Uniformity Trials for Subsequent Experiments. Jour. Agr. Sci., 20:63-73. 1930.
12. Snedecor, G. W. Statistical Methods. Collegiate Press, pp. 219-241. 1937.
13. Summerby, R. The Value of Preliminary Uniformity Trials in Increasing the Precision of Field Experiments. McDonald Col. Tech. Bul. 15. 1934.
14. Wishart, J. and Sanders, H. G. Principles and Practices of Field Experimentation. Emp. Cotton Growing Corp., pp. 45-56. 1935.

Questions for Discussion

1. What is covariance? Where is it useful in experimental work?
2. Interpret the use of the analysis of variance for the determination of significance of the regression coefficient.
3. How is the error for linear regression computed?
4. How can covariance be used on preliminary trial data to reduce the error in the experimental year?
5. Discuss conditions in field experimentation where it might be useful to use preliminary trial data to reduce the error in the subsequent test.
6. What assumption is made in the correction of stand by covariance? What precautions are necessary?
7. Upon what is the error of estimate based? Explain.
8. What does it mean when the mean square for adjusted treatment means is actually less than the mean square for the errors of estimate for error?
9. Name 3 types of agronomic tests where covariance might prove useful. Give the reason in each case.

Problems

1.
The yields of soybeans in a randomized block experiment with split plots are given below. Let x represent the yield of hay in tons por acre and y represent the yield of seed in bushels per acre. The total yields of the k plots of each spacing are assembled below. Bu. of seed per acre (y) Tons o f h ay per ac re (x) Width Spa cing w: Lthin rows Width of rows Sp f acing within rows of rows 1/2" 1" 2" ■ r Sum 1/2" 1" 3" Sum 16" 89.3 91.8 79.6 ■ 88.6 3^9.8 16" 11.1+0 10.72 9.63 9.68 1+1.1+3 20" 92.7 85.6 37.2 87.I 352.6 20" II.3I 10.06 9.73 9.31 1+0.1+1 2k" 90.6 82.3 8H.3 80.7 337-9 2k n 10.02 9.21 9.00 8. to 36. 61+ 28" 86.0 83.O 82.1+ 78.3 329.7 28" 0.62 9. in °.09 8.28 36.1+3 32" 85.I 78.1+ 7^.6 72.9 311.0 32" 9.53 8.72 8.1+5 7-77 3^.52 1+0" 78A 70.7 71.7 69.2 290.O 1+0" 8.31 8.19 7-3^ 7-59 31.93 Sum 522.6 1+91.8 1+79.8 1+76.8 1971.O Sum 60.7!+ 56.3I+ 53.2^ 51,0> 221.36 The analysis of variance for x and y are given below Variation due Correlation Regression to: D.F. ( y-j) (x-x)(y-y) (x-x) of x and y of y on x Blocks 3 IO.I+370 .II+38 Width of rows 5 182.0500 3.968!+ Error (a) 15 52.55I+2 .J___3 _______ 23 22 5. 01+12 1+. I+655 Spacings 3 5^.7512 2.2108 Width x spacing 15 30.1+300 .2389 Error (b) ___+ 268.6038 1.598^ Total 95 578.8262 ' 8.5115 126 (a) Calculate an analysis of co- variance. (t>) Calculate the correlation coefficient for tho different lines in the 'analysis of variance and co-variance. (c) Do the same for the regression coefficient of y on x. (d) Test the significance of the correlation coefficients 'by means of Fisher's table V.A. Mark the coefficients which exceed the c ) per cent point with one asterisk(-*) and those which exceed the 1 per cent point with two astericks (**) . (e) Test the significance of the regression for error ("b) "by means of an analysis of variance. G-et the -fl? also. (f) Test the significance of the regression for error ("b) "by means of "t" test. 
(g) Test the significance of the correlation coefficient for error (b) by means of the "t" test ; given by Fisher in section jk of his book. (h) Calculate the mean yield of seed In bushels for the six different width of rows. Calculate the predicted mean yields for each width, using the regres- sion for width of rows. Some data on number of sugar beet plants per plot and yield in tons per acre are given by Snedecor (1937) for a fertilizer experiment conducted in a randomised block test. The data for p replications are given below: Fertilizer No. (x) or Applied Yield (y) None x y Block 1 2 5 I83 176 291 2.M 2.2p k.^Q Treatment Sum Sums Sums S c: uai'O s Product s x 7 356 300 301 6.71 3. Mi- ] +.92 K x. y 22 k p.<-£ 2p8 2kk k.lk 2.32 ?K x y 6 . 34 303 .22 . 1, PIT x y 371 33^ 332 6.kB 7.11 5.88 KN NPK Block Sums x y X y X' y 230 221 3.70 3.2*f 237 2.82 322 367 k0( 6.10 7-^8 7.37 Svan. Squares Sums Products x y Due to the fact that the number of plants varies it is necessary to examine the effect of the variable stand and to estimate the yields on the basis of equal numbers of plants. Calculate as follows: 127 (a) Yield in a simple randomized "block experiment. (t>) The analysis of covariance of stand and yield. (c) Calculate the test for significance. (d) Give the conclusions for the test. 3. A sugar "beet variety test was conducted at Rocky Ford as a randomized block exper- iment in -which tho number of plants differed in each plot. The yields were taken on the basis of competitive plants per plot, and also on the basis of all the beets in the plot . The object of the experiment was to determine the yields of the different varieties. Since the number of beets varies from plot to plot, it is desired to examine the effect of the variable stand and to estimate the yields on the basis of equal numbers of beets. The data follow (unpublished data from 0. W. Deming) : Variety No Yi .00 eld or (y) Block Treatment No. 
                     1       2       3       4       5       6    Totals
1        x         243     217     227     210     218     215
         y       17.49   18.65   14.39   16.33   11.28   15.75
2        x         245     217     239     210     205     219
         y       17.99   19.22   18.21   14.11   16.84   13.76
3        x         238     228     205     191     224     211
         y       14.13   16.62   15.99   13.81   14.80   13.00
4        x         254     223     189     180     236     209
         y       20.19   21.45   14.01   12.54   14.20   17.65
5        x         249     221     226     242     246     216
         y       20.08   17.04   14.04   16.05   13.86    9.75
6        x         225     212     194     211     202     215
         y       17.49   19.63   17.55   16.02   15.46   14.05
Block    x
totals   y

Calculate the analysis of covariance and adjust the yields to a uniform stand basis.

PART III
Field and Other Agronomic Experiments

CHAPTER XII
SOIL HETEROGENEITY AND ITS MEASUREMENT

I. Universality of Soil Heterogeneity

One of the difficulties in yield tests is the fact that uniform soil conditions rarely exist, even over a small portion of any field. Soil variability has been noted by many investigators, but it was J. Arthur Harris (1915, 1920) who first presented data to show its extreme importance in field experimentation. Lyon (1911) states that it is "quite likely that productivity of plots change from year to year even with the same treatment", altho the work of Harris and Scofield (1920, 1928) and of Garber, et al. (1926, 1930) indicates a tendency for the differences in plot yields to be permanent.

A soil with differences so slight as to escape the most observant eye may have very great effects on plants which grow in it. Parker (1931) is authority for the statement that two plots of the same crop variety grown in "an apparently uniform soil and treated alike in every respect may differ from one another in yield by 20 per cent or more solely as a result of differences in soil conditions." Small plots have generally replaced large ones to correct for this condition, because it is obvious that two plants of the same variety grown one yard apart are more likely to yield alike than when 200 feet apart as probably would be true of one-acre plots.
It is impossible to avoid variation even under such conditions. Davenport and Fraser (1896) report results with 77 varieties of wheat grown on plots two rods square. Nine check plots of the same variety were systematically distributed over the area. The variation in the check plots was so great that only 8 varieties yielded more than the highest check, and but 3 lower than the lowest check. Soils vary in texture, depth, drainage, moisture, and available plant nutrients from yard to yard.

After the analyses of large amounts of data from all over the world, Harris (1920) concluded that soil heterogeneity was practically universal. He estimated it to be the most potent cause of variation in plot yields and the chief difficulty in their interpretation. In 1915 he stated: "It is obviously idle to conclude from a given experiment that variety 'A' yields higher than variety 'B', or that fertilizer 'X' is more effective than fertilizer 'Y', unless the differences found are greater than those which might be expected from differences in the productive capacity of the plots of soils upon which they are grown." Even earlier than this, Piper and Stevenson (1910) remarked that soil variability was so great that "doubt was cast on the greater portion of published field experiments where yield was primarily involved."

The yield differences must be large enough to overshadow soil variation, or the experiment designed so as to remove its effect. Much of the improvement in experimental methods for field experiments in recent years has been brought about thru special devices to measure much of the soil fertility variation and essentially eliminate it from the actual comparisons being made.

II. Uniformity Trial Data

Uniformity trial data have been used for the measurement of soil heterogeneity as well as for many other purposes in field experimentation.
The usual procedure is to plant a bulk crop, the area being later partitioned into small plots, usually of the same dimensions. The same cultural operations are carried out over the entire area. The yield of each plot is recorded separately at harvest. The usefulness of the uniformity trial lies in the fact that the small units can be combined into larger plots of various sizes and shapes in order to study variability. The variation in yield over the field is due to soil heterogeneity, as well as to plant variation, errors in weighing, etc. (generally summed up as experimental error).

The most obvious use for the data is to provide information on the optimum size and shape of plot. Uniformity trial data can also be used to compare the relative efficiencies of different experimental designs, particularly in relation to a certain crop. Data from previous uniformity trials may also be used to reduce the error of subsequent experiments laid down on the same plots. The method offers promise for perennial crops where the same plants are concerned, but offers little or no advantage for annual crops. A catalogue of uniformity trial data has been published by Cochran (1937).

Some agronomists conduct so-called blank trials (planted to a bulk crop) to observe soil heterogeneity as a preliminary step in experimentation on a new field. Love (1928) advocates such trials, especially as a preliminary to long-time experiments. They afford an opportunity for the investigator to detect good and poor spots on a field so that unsatisfactory areas may be eliminated. One objection to the blank trial used in this manner is that it takes time. Time may be an important element in an experiment.

III. Criteria for the Measurement of Soil Variability

Some accurate measure of soil heterogeneity may be desirable preliminary to steps for its correction.

(a) Correlation Coefficient

Harris (1915) supplied the first quantitative measure based on correlation,
his heterogeneity coefficient being an intra-class correlation coefficient. For use of the formula, the field must be planted uniformly to the same crop and harvested in small units. Harris grouped nearby plots. The number in a group was arbitrary, it being common to use 2 by 1, 2 by 2, and 2 by 3-fold groupings. The size of the heterogeneity coefficient is influenced by the size of group. The more plots that are put together, the greater is the correlation coefficient. The heterogeneity coefficient is expressed on a relative scale from 0.0 to 1.0 so that comparisons from field to field can be made directly.

This coefficient measures the degree to which nearby plots are similar in productivity. Should the correlation be sensibly zero, the irregularities of the field are not so great as to influence in the same direction the yields of nearby small plots. The higher the correlation, the greater the soil heterogeneity. One may grasp the significance when he remembers that the correlation coefficient multiplied by 100 gives the most probable percentage deviation of the yield of an associated plot when the deviation of one plot of the group from the general average is known. Hayes and Garber (1927), in explanation, state that in "patchy" fields certain contiguous units tend to yield high while others show a tendency in the opposite direction. Under these conditions a high correlation coefficient results. Where variability is due only to random sampling the correspondence between contiguous plots will be counter-balanced by lack of correspondence in others.

The same result can be obtained with the ordinary inter-class correlation coefficient as with the heterogeneity coefficient when a 2 by 1-fold arrangement is used. The analysis of variance can be used to obtain the same result as with the heterogeneity coefficient, as indicated by Fisher (1934).
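The agreement between the two routes for paired plots can be checked numerically. The sketch below is illustrative only: the five pairs of yields are made-up values, the function names are hypothetical, and the product-moment correlation is assumed to be taken on a symmetrical table in which each pair is entered twice, once in each order. The analysis-of-variance route uses the ratio of the between-group to the total sum of squares.

```python
from statistics import mean

def intraclass_r_anova(groups):
    """Intra-class r from the between-group and total sums of squares."""
    n = len(groups[0])                         # plots per group
    values = [x for g in groups for x in g]
    grand = mean(values)
    total_ss = sum((x - grand) ** 2 for x in values)
    between_ss = sum(n * (mean(g) - grand) ** 2 for g in groups)
    # between_ss / total_ss = [1 + (n - 1) r] / n, solved for r
    return (n * between_ss / total_ss - 1) / (n - 1)

def double_entry_r(pairs):
    """Product-moment r on a symmetrical table: (a, b) and (b, a) both entered."""
    xs = [a for a, b in pairs] + [b for a, b in pairs]
    ys = [b for a, b in pairs] + [a for a, b in pairs]
    m = mean(xs)                               # the same mean serves both columns
    sxy = sum((x - m) * (y - m) for x, y in zip(xs, ys))
    sxx = sum((x - m) ** 2 for x in xs)        # equals the sum of squares of ys
    return sxy / sxx

# Hypothetical yields of five pairs of adjacent plots (illustrative values only).
pairs = [(3.9, 4.2), (4.6, 4.4), (3.5, 3.8), (4.1, 4.0), (4.8, 4.5)]
r_anova = intraclass_r_anova(pairs)
r_table = double_entry_r(pairs)
print(round(r_anova, 4), round(r_table, 4))
```

The two printed values agree because, for groups of two, the between-group sum of squares equals half the total sum of squares plus the sum of products of the paired deviations.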
Intra-class correlation merely measures the relative importance of two groups of factors that cause variation. In the calculation it is necessary to obtain these equalities:

$S(x - \bar{x})^2 = (m - 1)\,n\,s^2$   (1)

$n\,S(\bar{x}_b - \bar{x})^2 = (m - 1)\,s^2\,[1 + (n - 1)\,r]$   (2)

where m = the number of arbitrary blocks, n = the number of ultimate units within a block, and $\bar{x}_b$ = the mean of a block.

The principal value of the correlation coefficient, either inter-class or intra-class, is to demonstrate that the fertilities of adjacent areas are correlated and that variability exists in the field.

(b) Fertility Diagram

The suitability of a particular lay-out adopted in an experiment can be judged to a considerable extent by a fertility diagram constructed from the individual plot yields. This is possible from uniformity trial data. An example taken from Crowther and Bartlett (1938) is given in Figure 1.

Figure 1. Variation in natural fertility at Bahtim, 1934 (yield in kantars per feddan).

IV. Computation of Heterogeneity by the Analysis of Variance

Suppose a field is divided into N small plots, all sown to the same variety. Some uniformity trial data from Mercer and Hall (1911) on the grain yields of one acre of wheat when harvested in 1/500-acre plots will be used to illustrate the method of computation. The yields in pounds per plot for the 24 plots in the northwest corner are as follows, the blocks being marked m1 to m6:

      m1                m2
  3.63   4.15   |   4.06   5.13
  4.07   4.21   |   4.15   4.64
      m3                m4
  4.51   4.29   |   4.40   4.69
  3.90   4.64   |   4.05   4.04
      m5                m6
  3.63   4.27   |   4.92   4.64
  3.16   3.55   |   4.08   4.73

It is noted that the area is divided into an arbitrary number of blocks all equal in size, i.e., there are 6 blocks with 4 ultimate plots in each. Let

x = the value of an ultimate plot unit.
N = the total number of plots = 24.
$S(x)$ = sum of all the ultimate plots = 101.54.
$\bar{x}$ = mean yield of the ultimate plots = 4.2308.
$S(x^2)$ = sum of squares of yields of the ultimate plots = 434.5582.
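The arithmetic of this example can be checked with a short script. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the blocks are taken as the six 2 by 2 groups of the grid, and the 24 yields are entered as read from the grid (two doubtful digits are taken as 4.15 and 3.16, a reading worth checking against the original Mercer and Hall data, since it reproduces the stated mean of 4.2308). The variance is estimated from $(m-1)ns^2$ and r is then solved from equation (2).

```python
# Yields in pounds per plot, northwest corner of the Mercer and Hall wheat acre.
yields = [
    [3.63, 4.15, 4.06, 5.13],
    [4.07, 4.21, 4.15, 4.64],
    [4.51, 4.29, 4.40, 4.69],
    [3.90, 4.64, 4.05, 4.04],
    [3.63, 4.27, 4.92, 4.64],
    [3.16, 3.55, 4.08, 4.73],
]

def block_totals(grid, rows_per_block=2, cols_per_block=2):
    """Totals of the rectangular blocks, taken row by row across the grid."""
    totals = []
    for r0 in range(0, len(grid), rows_per_block):
        for c0 in range(0, len(grid[0]), cols_per_block):
            totals.append(sum(grid[r][c]
                              for r in range(r0, r0 + rows_per_block)
                              for c in range(c0, c0 + cols_per_block)))
    return totals

def heterogeneity_by_anova(grid, rows_per_block=2, cols_per_block=2):
    """Sums of squares and intra-class r for a uniformity-trial grid."""
    values = [x for row in grid for x in row]
    big_n = len(values)                        # N, total number of plots
    n = rows_per_block * cols_per_block        # plots per block
    totals = block_totals(grid, rows_per_block, cols_per_block)
    m = len(totals)                            # number of blocks
    correction = sum(values) ** 2 / big_n      # (Sx)^2 / N
    total_ss = sum(x * x for x in values) - correction
    between_ss = sum(t * t for t in totals) / n - correction
    within_ss = total_ss - between_ss
    s2 = total_ss / ((m - 1) * n)              # from (m - 1) n s^2 = total SS
    r = (between_ss / ((m - 1) * s2) - 1) / (n - 1)
    return {"total_ss": total_ss, "between_ss": between_ss,
            "within_ss": within_ss, "s2": s2, "r": r}

result = heterogeneity_by_anova(yields)
for name, value in result.items():
    print(f"{name:>10s} = {value:.4f}")
```

Changing `rows_per_block` and `cols_per_block` gives other groupings of the same grid, provided the block dimensions divide the grid dimensions evenly.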
Then, the sums of squares are computed as follows:

Total = $S(x^2) - (Sx)^2/N$ = 434.5582 - 429.5988 = 4.9594

Between blocks = $S(x_b^2)/n - (Sx)^2/N$ = 1727.9410/4 - 429.5988 = 2.3864, where $x_b$ is a block total

Within blocks = 4.9594 - 2.3864 = 2.5730

The analysis of variance is as follows:

Variation          D.F.    Sums of Squares
Between blocks       5     2.3864 = $(m-1)s^2[1 + (n-1)r]$
Within blocks       18     2.5730 = $(m-1)s^2(n-1)(1-r)$
Total               23     4.9594 = $(m-1)ns^2$

Now, let m = the number of blocks, m - 1 = the degrees of freedom for blocks, n = the number of plots per block, and $s^2$ = the estimated variance. Then m = 6, m - 1 = 5, and n = 4.

Since $(m-1)ns^2 = 4.9594$, $20s^2 = 4.9594$, and $s^2 = 0.2480$.

From the formula for the sum of squares between blocks,

$(m-1)s^2[1 + (n-1)r] = 2.3864$, or $(5)(0.2480)(1 + 3r) = 2.3864$

Then r = 0.3082.

V. Amount of Soil Heterogeneity

In his studies of soil heterogeneity, Harris (1915, 1920) used fields planted to the same crop, but harvested in separate small plot units. The relative productivity of contiguous plots was determined.

(a) Variations in Yield in Same Season

Some of the results obtained by Harris are given by Hayes and Garber (1927):

Crop       Character     Plot Size         Investigator     r
Wheat      Grain yield   5.5 by 5.5 ft.    Montgomery       0.605 ± 0.029
Wheat      N-content     5.5 by 5.5 ft.    Montgomery       0.115 ± 0.044
Oats       Grain yield   1/30 acre         Kiesselbach      0.495 ± 0.055
Mangels    Roots         1/200 acre        Mercer & Hall    0.346 ± 0.037
Mangels    Leaves        1/200 acre        Mercer & Hall    0.466 ± 0.043
Potatoes   Tuber yield   12-foot row       Lyon             0.311 ± 0.043
Corn       Grain yield   0.03;^ acre       Smith            0.830 ± 0.019

The amount of soil heterogeneity in rod-row trials was measured by Hayes (1925) at the Minnesota Station in connection with a variety test. Four systematically distributed plots were used. To obtain the heterogeneity coefficient the average yield of each strain in the trial was considered as 100. The yielding ability of each plot was obtained by dividing its actual yield by the average yield of all four replicates and expressing the result in percentage.
By the ordinary method, correlations in yielding ability of adjacent plots or of plots at any distance apart were determined, The results were as follows for oats ; , spring wheat, and winter wheat: Correlation Coefficients (r) Factors Correlated Pi 1 ^ Spring Wheat Winter Wheat Adjacent plots 0.572 ± 0.025 0.6l8 ± 0.023 0.552 ± 0.063 Separated by one plot 0A90 - 0.029 0.518 ± 0.028 O.293 ± 0.028 Separated by four plots 0.26^ ± O.oUl .kk-9 - 0.03^ O.llU * 0.118. Separated by ten plots 0.275 - 0.057 0A29 ± 0.060 - --- The correlation coefficient explains very little unless one knows the factors in- volved. However, it affords the best means to consider the amount of replication that should be practiced. Similar results were obtained by Garber, Hoover, and Mcllvaine (1926) in West Vir- ginia experiments. They found a marked correlation between the yields of oat hay in contiguous plots. The correlation for the yields of replicated plots was sensibly zero . (b) Permanence of Differences It is important to know whether or not there- is a tendency for plots that produce low yields one season to produce low yields the next season, etc. The re- sults of Harris and Scofield (1920) indicate a tendency for plots to yield in a similar manner from year to year, altho there are some exceptions. Their data for inter-annual correlations for hop yields are as follows: 13 6 series 1909 1910 1911 1912 1st and 2nd Years 1st and 3rd 1st and ^th 1st and 5th 1st and. 6th Years Years Years Years 0.580 O.768 i 0.051 0.662 - 0.07 0.577 ± o.o32 0.447 t 0.099 . 0.451 - 0.098 0.27^ 0.3.15 * 0.111 0.126 $ 0.121 0.105 0.259 *0..115 O.IIJ4 0.061 - 0.12*; 0.062 x 0.123 0.511 i 0.111 0.703 - 0.062 0.597 ± 0.079 the result: 5 -year study on In a later paper, Harris and Schofield (.1928) gt a uniform cropping experiment at Hunt ley, Montana. In general, a positive correla- tion 'between the yields of a series of plots was found thruout a period of years. 
The plots which show a heavier yield one year will in general show heavier yields In other years during the perior under investigation. Under some conditions negative correlations -were found which were interpreted as indicating the importance of a preceding crop in determining the characteristics' of an experimental field. Garber, et al, (1926) found some tendency for plots which produced relatively high yields of oat hay in IO23 to produce relatively high yields of -wheat grain in 192''-. The correlation coefficient for th ; two was 0.564 * 0.056. The study was con- tinued by Garter and Hoover (1930) to determine whether or net the natural variation in noil productivity among plots as revealed "by a crop uniformity test persisted af- ter an experiment is started that involves different crops and different soil treat- ments. They correlated the relative yields from duplicate oat plots in 1923 and the relative average yields from, the same duplicate plots of other crops in a rotation experiment from 1924 to 1029 (incl,). The data were as follows: 'r ;i value between elds .n 1 Op". > and 1924 1925 1926 1927 1928 "i 000 130 150 1.26 128 120 15o O.38 i 0.05 0.35 * 0.05 0.48 * 0.05 0.41 * 0.05 0.42 ± 0.05 C.27 t 0,03 age 1923 to 1929 (incl.) 60 ± 'hese correlation coefficients were all statistically significant, and the d3.nerences m naturax this ce.se) even though the :.ncLieaec that proauctxvrcy may persist ov^r a soil be subjected to different treatments riod of years (five, T. Cant of He^eroffei Many fact or s may contribute to soil variation. Yields P. i*1 1 1 "•" !~YT\ *"\ ' l"T»W TvH *~'f raoii has demonstrated by the use of the correlation cc geneity is sufficient to influence experimental r moiseuro, ana s< If let erf :ults. crops from foot to ; ma;; >il fertility. ■ris (1020) that nibstratum netero- {&) Soil Topog; ?he topography oJ may direc Indirectly influence the variation in soil productivity. 
Steep hillsides are unadapted to experimentation because heavy rains gully the field and carry the fertilizers from plot to plot. Moreover, water is apt to pond on certain areas and influence crop yields. Sometimes errors are introduced by variation in the subsoil. For example, there are gravel pockets in the subsoil on the Judith Basin (Montana) field station.

(b) Soil Moisture

The water content of soil was studied by Harris (1915) on the U.S.D.A. Experimental Farm at San Antonio (Texas). He took borings 6 feet deep at 20-foot intervals on a field 150 by 264 feet in size. The coefficients ranged from r = +0.32 to +0.70, being statistically significant for each foot section of the upper 6 feet of soil.

(c) Fertility Elements

The carbon and nitrogen content of soils was studied by Harris (1915) at Davis, California. The heterogeneity coefficient for carbon was 0.417 ± 0.063, while that for nitrogen was 0.494 ± 0.057. On blow sands at Oakley, the r-value for carbon was 0.317 ± 0.068, and that for nitrogen was 0.230 ± 0.072.

Wide fluctuations in nitrate nitrogen were reported by Blaney and Smith (1931) on 1/30-acre plots. They found that the probable error was usually greater than 5 per cent where less than 20 soil cores were considered. In fact, they recommended 50 soil samples on a 1/30-acre plot to reduce the error to approximately 5 per cent of the mean. When soils outside of Rhode Island were considered, they found that 6 to 81 samples were necessary to obtain a probable error that low.

Some Colorado Station data show extremely wide fluctuations in p.p.m. of nitrate nitrogen on an irrigated soil. A 13 by 10-foot plot was sampled in 5 places to a depth of 6 feet. The nitrate nitrogen varied from 5 to 35 p.p.m. on this small area. It is obvious that variations in nitrate nitrogen can cause yield differences from area to area.

VII.
Corrections for Soil Variability

Once soil heterogeneity is recognized, some means must be found to avoid or correct its influence in field experiments. A decrease in size of plots and an increase in the number of replications (as will be shown later) has been the general practice to overcome soil variation. The replicate plots of varieties or treatments to be tested against each other are scattered out so that they may sample the different conditions of the trial area. One variety, for instance, may be grown partly on favorable portions and partly on less favorable portions. This usually means that the variety encounters somewhere near average soil conditions. Efficient experimental designs provide for the removal of a portion of variability due to soil.

Artificially constructed field plots were studied by Garber and Pierre (1933) over a 3-year period. They found that soil heterogeneity was largely removed by a thorough mixture of soil placed in 30 artificial bins. These soil bins were 9 feet 4 inches by 4 feet 8 inches (inside area) by 24 inches in height, and were 0.001 acre in area. They obtained a probable error of a single determination in per cent of the mean of 3.4 for soybean hay, and 6.2 for wheat. They found, however, that the variation in crops was still too high to make replication unnecessary.

VIII. Relation to the Experimental Field

Many early experimental fields were poorly selected because of the belief that an experimental farm should contain many different soil types, i.e., the soil should be extremely heterogeneous. The Ohio Experiment Station was allowed to relocate after the first 10 years due to the poor choice of the original site (Thorne, 1909). For all ordinary field experiments the land should be as uniform as possible in regard to topography, fertility, subsoil, and previous soil management.
However, extreme uniformity may defeat the purpose of the investigator unless such soil is representative of the area for which the results are to apply.

(a) Topography

A perfectly level piece of land is as undesirable for field experiments as one with surface inequalities because water may pond on it. A slope of 1 or 2 per cent will permit water from heavy rains to flow off uniformly and completely. A slight slope is highly desirable on land to be irrigated. Some experimenters use low land or "draws" and irregular areas for bulk crops or for seed increase plots.

(b) Previous Soil Treatment

It is desirable to have soils which have had uniform previous treatment because there may be a carry-over effect of previous treatments. According to the American Society of Agronomy standards (1933): "When a field or series of plots has been occupied by varietal or cultural tests of such a nature as to seriously increase soil variability, one or more uniform croppings should intervene (or follow) before it is again used for such tests. It is frequently helpful to arrange the plots at right angles to the direction of the previous plots."

(c) Subsoil Conditions

When it is necessary to drain lands in the humid regions, the tile lines should be located so as to influence all plots alike. They should run across the plots rather than with them. In the case of soil fertility experiments, it is recommended that a soil profile be taken to a depth of 3 feet for each series of plots. Before soil treatment experiments are begun, representative samples of the soil and subsoil should be carefully taken for such analyses as may be desired for future reference.

References

1. Blaney, J. E., and Smith, J. B. Sampling Market Garden Soils for Nitrates. Soil Sci., 31:281-290. 1931.
2. Cochran, W. G. A Catalogue of Uniformity Trial Data. Suppl. Jour. Roy. Stat. Soc., 4:233-253. 1937.
3. Crowther, F., and Bartlett, M.
S. Experimental and Statistical Technique of Some Complex Cotton Experiments. Emp. Jour. Exp. Agr., 6:33-68. 1938.
4. Davenport, E., and Fraser, W. J. Experiments with Wheat, 1888-1895. Ill. Agr. Exp. Sta. Bul. 41, pp. 153-153. 1896.
5. Fisher, R. A. Statistical Methods for Research Workers. Oliver and Boyd, 5th Ed., pp. 210-214. 1934.
6. Garber, R. J., et al. A Study of Soil Heterogeneity in Experiment Plots. Jour. Agr. Res., 33:255-268. 1926.
7. Garber, R. J., and Hoover, M. M. Persistence of Soil Differences with Respect to Productivity. Jour. Am. Soc. Agron., 22:883-890. 1930.
8. ________, and Pierre, W. H. Variation of Yields Obtained in Small Artificially Constructed Field Plots. Jour. Am. Soc. Agron., 25:98-105. 1933.
9. Harris, J. Arthur. On a Criterion of Substratum Homogeneity (or Heterogeneity) in Field Experiments. Am. Nat., 49:450-454. 1915.
10. ________. Practical Universality of Field Heterogeneity as a Factor Influencing Plot Yields. Jour. Agr. Res., 19:279-314. 1920.
11. ________, and Scofield, C. S. Permanence of Differences in the Plots of an Experimental Field. Jour. Agr. Res., 20:333-356. 1920.
12. ________, ________. Further Studies on the Permanence of Differences in the Plots of an Experimental Field. Jour. Agr. Res., 36:15-41. 1928.
13. Hayes, H. K., and Garber, R. J. Breeding Crop Plants. McGraw-Hill, pp. 56-69. 1927.
14. ________. Control of Soil Heterogeneity and Use of the Probable Error Concept in Plant Breeding Studies. Minn. Agr. Exp. Sta. Tech. Bul. 30. 1925.
15. Love, H. H. Planning the Plat Experiment. Jour. Am. Soc. Agron., 20:426-432. 1928.
16. Lyon, T. L. Some Experiments to Estimate Errors in Field Plot Tests. Proc. Am. Soc. Agron., 3:89-114. 1911.
17. Mercer, W. B., and Hall, A. D. Experimental Error of Field Trials. Jour. Agr. Sci., 4:107-127. 1911.
18. Parker, W. H. Methods Employed in Variety Trials by the National Institute of Agricultural Botany. Jour. Natl. Inst. Agr. Bot., 3:5-22. 1931.
19. Piper, C. V., and Stevenson, W. H.
Standardization of Field Experimental Methods in Agronomy. Proc. Am. Soc. Agr on., 2:70-76. 1910. 20. Standards for the Conduct and Interpretation of Field and Lysimeter Experiments. Jour. Am. Soc. Agron., 25:803-828. 1933. 21. Thorne, C. E. Essentials of Successful Field Experimentation. Ohio Agr. Exp. Sta. Cir. 96. 1909 . 22. Wishart, J., and Sanders, H. G. Principles and Practice of Field Experimentation, Emp. Cotton Growing Corp., pp. 7-8, and 60-6y. 1935. Que stion s for Discussi on 1. Discuss how soil heterogeneity might influence yield trials. 2. Why and how may small plots overcome the influence of soil variation? 3. What is a uniformity trial? How conducted? k. What uses can be made of uniformity trial data? 5. How can correlation "be used to measure soil heterogeneity? 6. Fundamentally, what is the so-called "heterogeneity coefficient" used "by J. Arthur Harris? How interpreted? 7. What evidence did Harris have that soil heterogeneity was universal? 8. What general results were obtained at the Minnesota Station when the yields of adjacent plots were correlated? Those separated "by other plots? 9. Are differences in the productivity of plots constant from year to year? Explain. 10. How may soil topography, moisture, and nitrogen account for soil heterogeneity? 11. What corrections can he used for soil variability? 12. Would artificial soil bins do away with the need for replication? Explain. 13. What precautions should be taken in the selection of an experimental field? 1*. Is extremely uniform soil always desirable for experimental work? Explain. 15. What is the value of a bulk crop preceding an experiment? 16. To what use would you put uneven and low land in an experimental field? Why? Probl ems One acre was planted uniformly to the same variety of wheat and harvested in units l/500-acre in size. (Data from Mercer and Hall). The 16 plots in the southwest corner of the acre gave yields in pounds as follows : 3.87 *1-.2I 3.68 *K 06 3.76 3.69 5.8* 3.6? 
3.91 k.y h,2l V.19 3.5^ 3-59 3.76 3.30 (a) Calculate the correlation coefficient by the analysis of variance for a 1 by 2 -fold arrangement. (b) Calculate the simple correlation coefficient for the same paired values. 2. Some unpublished data from the Akron Field Station give the average yields of corn and oats (combined) in bushels for a particular piece of land for 20 years as follows: .uo North Totals 1-13.9 39-8 ko.6 39.4 165.7 39.0 kO.k 37-7 35.0 152.1 32.9 kO . 5 39-5 36.7 149.6 yd. 6 1+1.3 41.1 35.5 156.5 35. 4 '13 . 1 37.3 30.5 146.3 West Totals I89.8 205.1 196.2 177.1 (a) Calculate the correlation coefficient by the analysis of variance to determine the heterogeneity from north to south, i.e., for a 1 "by k combination. (b) What is the correlation coefficient for a west to east direction? Calculate r for 1 by 5 combinations. .(c) In what direction is the soil most variable? Why? (d) Assume that the yield for each plot is 40 bushels in the above problem 2. Calculate the correlation coefficient by the analysis of variance. 5- Some yields of wheat plots of a single variety grown in 10 by 10-foot plots were as follows: (Pat a from Montgomery) . 67 So 70 76 76 09 59 74 73 71 00 71 61 60 67 65 66 77 79 77 66 57 62 72 54 61 6k 80 76 76 64 68 65 67 76 fl 68 72 77 or. lh 58 62 68 77 7H 77 70 65 oc 58 60 71 64 70 6k 65 57 75 Ik 73 63 62 69 86 6k 66 61 62 62 57 37 56 65 69 6h 63 57 61 65 58 ^ 73 73 71 73 71+ 66 67 (9 59 53 60 78 78 73 72 73 60 7>+ Calculate the heterogeneity coefficient (intra-class correlation) by the analysis ef varian.ce for a 2 by 1-fold combination (2 horizontal rows and 1 vertical row) . CHAPTER XIII SIZE, SHAPE AND NATURE OF PLOTS I. Early Use of Field Plots Modern field experiments "began in 183^ when Jean Boussingault started a series of tests on his farm near Bechelbronne in Alsace. 
Early agricultural investigators favored large plots because of their attempts to conduct field trials in essentially the same manner as the farmer handled his crops. The size of plot was considered at the Virginia Experiment Station as early as 1890 by Alwood and Price (1890), who suggested that, within limits, the larger the plot the more reliable the results. However, they conceded that small plots were sufficiently accurate for preliminary trials and for obtaining information on earliness and general quality of varieties. Taylor (1908) found a wide variation in size of plots used in this country in 1908. They varied from two acres in a Georgia cotton experiment to 1/40-acre in size, with all sizes between the two extremes. The average size of plot in America at that time was 1/10-acre.

The size of plots in relation to the experimental error was first studied at the Rothamsted Experimental Station in 1910 by Mercer and Hall (1911). As a result of their work and that carried on subsequently by others, the trend has been toward smaller plots and increased replication. A questionnaire, sent out by the Committee on the Standardization of Field Experiments of the American Society of Agronomy in 1913, reflected this tendency. The plot sizes used by different agronomists varied from one acre to 1/200-acre, with very few using plots larger than 1/10-acre or less than 1/80-acre.

At the present time, plot sizes vary from 1/10 to 1/1000-acre in size. The basis for the smaller plots with increased replication has been data from various blank or uniformity trials conducted by Mercer and Hall (1911), Day (1920), Summerby (1925), McClelland (1926), Wiebe (1935), Smith (1938) and many others. The catalogue by Cochran (1937) should be consulted for uniformity trials with specific crops.

A. Size and Shape of Plots

II. Factors that Influence Plot Size

There are several factors to consider in plot size aside from the accuracy of the results.
Some of these are: kind of crop, number of varieties or treatments, kind of machinery to be used on them, and the amount of land, labor, and funds available for the tests.

(1) Kind of Crop: It is the general practice to use larger plots for corn, sugar beets, and the forage plants than for small grains. The plots must be large enough to carry a representative population of the crop involved.

(2) Number of Varieties or Treatments: Small plots are a necessity when large numbers of varieties or strains are in various testing stages. In small grains, it is not uncommon to have from 500 to 20,000 strains in the various stages of a breeding program.

(3) Amount of Seed: In the early years of selection in small grains and in many other plants, only a very small amount of seed is usually available. Obviously, the plots must not be too large for the seed supply.

(4) Kind of Machinery: The area and shape of field plots should be such as to enable the operation of standard farm machinery and to reduce to a reasonable minimum the errors concerned therewith. Larger plots are necessary when the crop is planted, cultivated, and harvested with standard farm machinery than where hand methods are used.

(5) Land Area: For a given area of land, the plot size varies inversely with the number of varieties or treatments to be included. This is true until the minimum practical size is reached. As a result, to quote Goulden (1929): "The general practice is to use quite small plots adequately replicated for strain tests, i.e., when there are a large number of varietal units, and larger plots when the number of varieties is small enough to permit their use with the amount of land available."

(6) Funds Available: In general, it is more costly to use large plots than small plots.

III. Kinds of Experimental Plots

It is necessary to distinguish between nursery and field plots more or less arbitrarily.
Nursery plots are usually small plots cared for by hand, while field plots are larger and adapted to the use of standard farm machinery. The present tendency is to reduce the size of field plots and to enlarge nursery plots from single to multiple short rows (rod-rows in many cases).

(a) Nursery Plots

Nursery plots may be as small as one square yard in area, but the rod-row is probably the most common unit size. Small plots allow the preliminary testing of many strains. However, uniform soil and careful technique are vital to accuracy for small plots. Taylor (1908) points out that small mistakes on small plots may greatly modify the results. For example, an error of 5 pounds on a 1/20-acre plot would mean an error of 100 pounds on an acre basis. The rod-row unit has been widely used in this country for small grain trials, while the chessboard plot has been used in England. Engledow and Yule (1926) describe the latter as being one yard square with the crop space-planted at 2 by 6 inches. The principal objection to the chessboard is the amount of detailed hand labor involved and the fact that it affords less opportunity to observe strength of straw, evenness of germination, etc. As plant individuality must be considered in row crops, there is some variation in type of nursery plots.

(b) Field Plots

For standard farm machinery, field plots usually vary from 1/10 to 1/100 acre in size. They offer more opportunity to observe crop behavior under conditions comparable to those found on the farm. Field plots are used for variety tests, crop rotation experiments, fertilizer trials, forage experiments, pasture experiments, irrigation studies, cultural trials, etc. Ordinarily, such plots are long and narrow in shape, as is most convenient for farm machinery.

(c) Comparison of Nursery vs. Field Plots

The use of small hand-sown nursery plots to test yields of agricultural crop varieties has been frequently criticised on the ground that such plots do not represent normal agricultural conditions. In general, small plots have been found to compare favorably with large field plots in accuracy so long as adequate precautions have been taken against competition and other errors. There is further evidence that nursery plots give results that are valid when applied to agricultural practice.

As early as 1910, Lyon (1910) reported a comparison of seven 1/10-acre plots with seven groups of 10-row plots 17 feet long. The probable errors were 5.09 and k.k'-); respectively. Moreover, less land was required for the small plots. Seven 1/10-acre plots covered an area of 30,492 square feet, while 70 of the 17-foot rows required only 1,190 square feet. A general correspondence of rod-rows and field plots has been shown by Klages (1933) for 11 to 14 varieties of spring wheat, 7 varieties of durum wheat, 12 to 15 varieties of oats, 13 to 20 varieties of barley, and 7 varieties of flax in each of 4 years. He calculated the correlation coefficients (r) for the two sets of plots. Hayes and others (1932) compared the yields of 16 wheat varieties sown in rod rows by hand and by a drill at different rates with those sown by a farm drill in 1/40-acre plots. The correlation coefficients indicate some agreement between the yields obtained from the small and large plots. Smith (1936) has criticized the correlation coefficient as inefficient in such comparisons: "If real differences between varieties were either small or non-existent, then the correlation coefficients would be zero or insignificant, altho the trials might agree in showing no significant differences between them. On the other hand, the correlation coefficient could not become unity unless experimental error could be entirely eliminated.
Consequently, r may vary from 0 to +1 even while the two forms of trial are in perfect agreement."

In a study of 12 timothy varieties, Smith and Myers (1934) showed that the yields from rod-rows and 1/50-acre field plots agreed to precisely the degree required by statistical theory. Smith (193^) later compared 9 wheat varieties sown by a farm drill in 1/100-acre plots and dibbled in square yard plots. Agreement of the two experiments was excellent with respect to yield of grain. Tysdal and Kiesselbach (1939) compared 2 varieties of alfalfa in 1/30-acre field plots with various 16-foot nursery plots which differed as to number of rows and spacing. They combined the forage yields into a single analysis of variance from which they concluded that the several types of nursery plots gave essentially the same yields of the two varieties as did the field plots. The interaction of varieties x type of plot was not significant.

The problem resolves itself into whether small nursery plots with more precise control of soil heterogeneity will give the same results as large field plots with less control of soil variability. The sacrifice in plot (sample) size must be balanced by more effective control of soil heterogeneity for the small nursery plot to be as satisfactory as the large field plot. This can be brought about to some extent by increased replication of small plots.

IV. Relation of Plot Size to Accuracy

In general, it has been found that the variability is decreased as the plot is increased in size up to about 1/10-acre. However, the variability is less when a unit of a certain area is made up of several distributed units than when a single large unit is used. In a theoretical discussion, Siao (1935) states: "Increasing the size of plot decreases the variability of the experiment by increasing the precision of a single plot yield. On the other hand, there is an increase in the variability within the block through expanding the area included in the block.
There are two opposing tendencies that affect the experimental error as the plot changes in size, the final result being due to a balance between these two tendencies. The slow rate of reduction in experimental error through increase in size of plot and, in exceptional cases, the greater variability for larger plots, may be explained by increase in variation within the block as the plot increases in size." The work of Stadler (1921) and Wiebe (1935) indicates that the total variation tends to increase as more land is added to the experimental area, provided the size and shape of the ultimate units remain the same.

It should be emphasized that plot size varies with the conditions of the experiment, there being no one size best for all crops on all soils. Comparative studies on plot size have been carried out in most instances on blank or uniformity tests. After the optimum plot size has been determined, the standard error per plot and the number of replications needed to reach a given degree of accuracy in the comparison of the mean treatment yields are usually computed. Typical investigations on plot size will be considered for small grains and for other crops separately.

(a) Small Grain Plots

Much of the earlier work was conducted with small grains. The conclusions applicable to one are generally applicable to the others. Mercer and Hall (1911) used uniformity trial data for an acre of wheat, the field being divided into 500 small plots each of which was harvested separately. Adjacent plots were grouped so as to form plots of different sizes. The standard deviations in per cent for the 1/500, 1/250, 1/125, 1/50, 1/25, and 1/10-acre plots were 11.6, 10.0, 8.0, 6.5, 5.7, and 5.1, respectively. The standard deviation was reduced as the plots were made larger, but the increase in plot size above 1/50-acre produced a relatively small decrease in variability. These investigators found that precision was increased more rapidly by replication.
When five scattered 1/500-acre plots were combined so as to give a total area of 1/100-acre, the standard deviation in per cent of the mean was reduced to k.S per cent. Olmstead (1914) found with wheat that a number of small plots ranging down to 0.0007-acre in size is much better than the same total area in one plot, and also that one large plot is more accurate than one small one. In wheat studies, Day (1920) found that the probable error decreased with an increase in plot size up to 1/20-acre. From a blank trial with wheat, Smith (1938) concluded that the reduction in variability by increasing plot size was less than could be obtained by equivalent random replication.

Hayes (1923) compared 16 and 32-foot rows of wheat, oats, and barley. He failed to find a significant difference in favor of 32-foot rows. A comparison of one and two rod-row plots indicated little advantage for the harvest of two rod-rows per plot over one. Stadler (1921) obtained data on three and five-row plots, the border rows being discarded. His results follow:

Crop No. Plots Coefficient of Variability One Central Row Three Central Rows Barley Oats Wheat 21 20 On 2^.80 oh P-o 27.68 22.13 22.59 25.ll

Summerby (1925) found very little difference in accuracy between large and small plots when eight replications were used. His oat plots were 1, 2, 4, 8, 16, and 32 rows in width, spaced one foot, and 15 feet long. Love and Craig (1938) made an analysis of data from 2 oat crops for various types of plots and various numbers of replications, and for rows 15 and 30 feet in length. The data indicate that 4-row plots with several replications (8 or 10), when all rows are harvested, give accurate results. They are preferred to single-row plots. The 15-foot rows were considered more satisfactory than those 30 feet in length. Such data as these support the widespread practice of using three rod-row plots for small grain nursery trials with the center row harvested for yield.
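The Mercer and Hall figures quoted above can be put on a per-unit-of-land footing. The following is a minimal sketch rather than the authors' own computation. It assumes that a plot's efficiency in the use of land may be taken as the variance of the smallest unit divided by k times the variance of a plot made up of k such units, and it uses only the percentages quoted for the 1/500, 1/250, and 1/10-acre plots. The function names are hypothetical.

```python
import math

# Standard deviations (per cent of the mean) quoted from Mercer and Hall (1911),
# keyed by the number of 1/500-acre units combined into one plot.
SD_PERCENT = {1: 11.6, 2: 10.0, 50: 5.1}


def expected_sd_if_independent(base_sd: float, units: int) -> float:
    """SD of a plot of `units` basic units if those units varied independently."""
    return base_sd / math.sqrt(units)


def land_efficiency(base_sd: float, sd: float, units: int) -> float:
    """Percentage efficiency per unit of land relative to the basic unit.

    Assumed definition: 100 * base_sd**2 / (units * sd**2).
    """
    return 100.0 * base_sd ** 2 / (units * sd ** 2)


base = SD_PERCENT[1]
for units, observed in SD_PERCENT.items():
    print(
        f"{units:>2} units: observed {observed:5.1f}, "
        f"independent expectation {expected_sd_if_independent(base, units):5.2f}, "
        f"efficiency {land_efficiency(base, observed, units):6.1f}%"
    )
```

On this reckoning the observed standard deviations fall far more slowly than independence would predict, and the land efficiency of the 1/10-acre plot is only about one-tenth that of the smallest unit, which agrees with the statement that precision is gained more rapidly by replication than by enlarging the plot.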
(b) Other Crops

Different crop plants are known to differ in variability. The coefficients of variability for different crops were compared by Smith (1938) for a standard 1/40-acre plot from the published data for 39 uniformity trials. The crops fell roughly into 3 groups: (1) wheat, mangolds, sugar beets, soybeans, and sorghums (forage) seem to be less variable; (2) corn, potatoes, cotton, and natural pasture were intermediate; and (3) fruit trees were most variable.

In the case of corn, Bryan (1933) reports that "variability of plot yields decreased as the size of plots increased from 3 to 16, to 24, and to 48 hills, but the decrease was not proportional to the size of plot. The experimental error for a given area, therefore, would be lower with larger numbers of small plots." McClelland (1926) obtained a similar reduction in error as the size of plot was increased, the error being 11.2 per cent for 1/80-acre plots, and 6.2 per cent for those 1/2 acre in size.

With sorghums, Stephens and Vinall (1928) concluded that the errors decrease with an increase in plot size up to 1/20-acre. Increasing the plot from l/oOO-acre to 1/20-acre, with the same total area concerned, reduced the probable error about 60 per cent.

The standard error was found by Immer (1932) to be actually reduced in sugar beet plots when the plot size was increased from one to two rows in width, or for an increase in length from two to four rods. However, efficiency in the use of land decreased as the size of plot was increased. Some of his data for the harvest of the entire plot are as follows:

    Length of Plot    Percentage Efficiency of Plots of Indicated Width (Rows)
       in Rods           1        2        3        4        6       12
          2            100.0     88.0     77.7     53.3     34.9     21.4
          4             76.2     62.5     48.2     35.2     21.2     28.8
         10             50.0     37.6     28.6     26.1     10.2      9.4
         20             35.1     24.5     21.6     10.1      5.8      6.7

Similar results were reported by Immer and Raleigh (1933).

Uniformity trial data with soybeans, computed by Odland and Garber (1928), indicate that 16-foot plots in single rows replicated three times were the most satisfactory when both accuracy in results and land economy were taken into account.

Westover (1924) experimented with 220 rows of potatoes, 150 feet long. He harvested them in 10-foot lengths, and found a sharp reduction in probable error between row lengths of 10 and 40 feet. Beyond 60-foot lengths, there was very little reduction in error.

Ligon (1930) found no necessity for rows greater than 100 feet in length for cotton, the shorter rows being just as accurate when sufficiently replicated. Unit rows of cotton 24 feet long and spaced one foot apart were used by Siao (1935) in studies on size of plot for cotton. When combined into plot sizes of 1, 2, 3, 4, and 3 rows, the efficiency was greatest for the smallest plot.

In plot size studies with millet, Li and others (1936) concluded that plots 15 feet long and two rows wide were the most efficient, i.e., 113.9 per cent compared to 100 per cent for 15-foot plots one row wide.

Batchelor and Reed (1918) studied the variability of orchard plot yields from the standpoint of increasing the number of adjacent trees per plot. The average reduction in variability for all fruits was 37.78 to 24.27 per cent when the plot was increased from one to eight trees, but little was gained by including 16 to 24 trees per plot.

The reasons for variability in small plots may be summarized as follows: (1) variability in soil, (2) losses in harvest and errors in measurement have a relatively great effect, (3) in row crops, plant variability may be important because of fewer plants, (4) competition and border effects are apt to be greater on small plots.

V. Plot Sizes for Various Crops

The plot sizes depend upon the crop plant, and upon the conditions under which the test is conducted.
(1) Small Grains: The majority of experiment stations use three-row plots with the center row harvested for yield, but a few use five-row plots with the center three rows used for the yield determination. A few use single-rod-row plots.

(2) Corn: The Nebraska station uses four-row plots, 12 hills long, harvesting the center rows for yield. Others use single rows about 20 hills long, or three rows of the same length with only the center one harvested for yield. Bryan (1933) reports that, in a comparison of open-pollinated varieties and hybrids, equal degrees of precision were attained with about half as many plants or hills of crosses as of open-pollinated varieties. He found that 48 total hills were sufficient to represent a variety.

(3) Soybeans: Soybeans may be grown in rows 16 feet long and 30 to 32 inches apart. Field plots are often employed.

(4) Sorghums: The work of Stephens and Vinall (1928) indicates that "three or four replications of 1/40-acre or 1/80-acre plots will give results sufficiently reliable for the ordinary sorghum test". Slightly larger plots are advocated by Swanson (1930). When protected by borders, 2 and 4-row plots 8 rods long, having an area from 1/50 to 1/25-acre, are regarded as convenient units. At Kansas, four-row plots about 100 feet long are used. The grain sorghums are thinned to eight inches in the row, while the forage sorghums are spaced four inches in the row. The rows are spaced the same distances apart as for corn.

(5) Alfalfa and Clovers: These crops are usually grown in field plots about seven feet wide and 60 feet or longer in length, with the center five feet harvested with a mower. Tysdal and Kiesselbach (1939) state that the most serviceable types of plot for advanced nursery testing appear somewhat optional among these: (a) solid-drilled 5 to 8 rows spaced 7 inches apart with a 12 to 14-inch alley between border rows; or (b) solid-drilled 3 to 5 rows spaced 12 inches apart with an 18-inch alley between border rows. The entire plot may be harvested since very little error due to border effect occurs. (c) Single rows spaced 18 to 24 inches apart are permissible for preliminary nursery tests.

(6) Sugar Beets: Immer (1932) states that four-row plots are the most efficient. The rows should be two to four rods long, spaced 20 to 22 inches apart, and the plants thinned to about 12 inches in the row.

VI. Relation of Shape to Reliability

Some investigators have found that long narrow plots best overcome the effects of soil heterogeneity, while others believe that plots should be approximately square. For example, Barber (1914) reported that a small square plot affords a more accurate basis for variety comparisons than a long narrow plot that has extra growth along the borders when alleys exist between the plots. On the other hand, Kiesselbach (1918) showed that the coefficient of variability for 1/10-acre oat plots 48 rods by 5.5 feet was 3.84 per cent, as compared with 5.18 per cent for plots 16 rods by 16.5 feet. Justesen (1932) found long narrow plots to be more efficient than shorter plots of the same area.

Mercer and Hall (1911) divided the plots of a single variety into plots of equal area but of different shapes. The dimensions were 20 by 12, and 50 by 5 yards. They found no significant difference in variability between them. Similar results were obtained by Stephens and Vinall (1928) with sorghums. Bryan (1930), in his work with various shapes of corn plots, concluded that shape is less important as the size of plot is reduced. With plots as small as 16 hills, either single, two or four-row plots may be expected to give similar results.

These apparent inconsistencies are explained in the work of Day (1920), who harvested, in five-foot sections, a 1/40-acre area uniformly cropped to wheat and combined the ultimate units to form plots of various shapes.
He found that plots with their greatest dimension in the direction of the least soil variation are more variable than plots having their greatest dimension in the direction of the greatest variation. He found that shape exerted no influence on accuracy where soil variation is as great in one direction as it is in the other. Some of his data follow:

    Adjacent Rows   Length of Rows   Total Length of Rows
        (No.)           (Ft.)           in Plot (Ft.)       Shape of Plot                            C.V.
          1              150                150             Long in direction of least variation    17.04
          3               30                150             Long in direction of least variation    to.37
         10               15                150             Rectangular                             12.72
         24                5                120             Long in direction of most variation     10.34

Similar conclusions were obtained by Siao (1935) for cotton and by Smith (1938) for wheat.

It is generally conceded that relatively long and narrow plots with the long dimension in the direction of the greatest soil variation best overcome the effects of soil heterogeneity. In addition, linear plots are more economical for cultural operations. However, the area occupied by a single replicate or block should approach a square in shape for the most efficient design.

VII. Practical Considerations in Plot Shape

Width of plots should be sufficient to allow for the removal of border rows when this appears desirable, or to render border effects negligible when not removed. The triple rod-row is a convenient shape for small plots, while large plots are usually rectangular in shape to accommodate farm machinery in an attempt to simulate farm conditions.

(a) Adaptation to Farm Machinery

Some multiple of seven feet provides a favorable width for field experiments as it permits convenient operation of the 3.5 and 7.0-foot farm implements. Kiesselbach (1928) calls attention to the fact that the multiple of seven feet will enable the use of the seven-foot disk, seven-foot drill, seven-foot binder, and 3.5-foot corn planter and cultivator.
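Once the plot width has been fixed by the implement, the length needed for a plot that is a given fraction of an acre follows from the area of an acre (43,560 square feet). The sketch below is only an illustration of that arithmetic. The function name and the particular widths tried are assumptions made for the example.

```python
SQ_FT_PER_ACRE = 43_560  # square feet in one acre


def plot_length_ft(acre_fraction_denominator: int, width_ft: float) -> float:
    """Length in feet of a 1/denominator-acre plot of the given width."""
    area_sq_ft = SQ_FT_PER_ACRE / acre_fraction_denominator
    return area_sq_ft / width_ft


# Lengths of 1/40-acre plots laid out at multiples of a seven-foot implement width.
for width in (7, 14, 21):
    print(f"width {width:>2} ft -> length {plot_length_ft(40, width):6.1f} ft")
```

The same function gives 792 feet, or 48 rods, for a 1/10-acre plot 5.5 feet wide.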
The standards of the American Society of Agronomy (1933) recommend 14 feet as a minimum plot width for crop rotation, fertilizer, and tillage experiments, while varietal tests with inter-tilled crops commonly should contain at least three or four rows. Extremely narrow plots, in the case of manurial or fertilizer tests, make it difficult to keep the treatments within the plot limits.

(b) Calculation of Plot Size

Plots should be made an aliquot part of an acre, e.g., 1/40, 1/50, or 1/80-acre plots. This sort of plan is worthwhile because of the grave possibility of error in computations made on acres expressed as decimal fractions. For instance, to calculate the dimensions of a 1/40-acre plot for a drill seven feet wide, the steps are as follows:

    43,560/40 = 1,089 square feet in 1/40 acre.
    1,089/7 = 155.6 feet for length of the plot.

Hayes (1923) suggests, for small grain nursery rows spaced 12 inches apart, that the row length be adjusted slightly so that gram yields per plot can be converted to bushels per acre by multiplying by a simple conversion factor. The factor 0.2 can be used for a 15-foot row of oats, the factor 0.1 for a 16-foot row of wheat, and the factor 0.1 for a 20-foot row of barley.

VIII. Calculation of Plot Efficiency

Some uniformity data on 120 rod rows of Haynes Bluestem wheat in bushels per acre, as given by Hayes and Garber (1927), will be used for the computation of plot efficiency. The method was suggested by Dr. F. R. Immer.

25.0, 24.9, 29.4, 28.7, 28.0, 27.0, 28.9, 20.9, 25.3, 24.1, 22.0, 21.7, 25.0, 20.7, 24.4, 23.1, 26.8, 25.2, 23.7, 26.4, 23.8, 24.6, 27.5, 25.2, 27.2, 26.0, 27.4, 25.2, 25.8, 22.1, 22.0, 24.5, 30.7, 29.6, 27.8, 31.7, 27.9, 30.0, 21.6, 26.9, 18.6, 26.3, 25.7, 25.4, 23.8, 28.0, 28.2, 22.7, 28.3, 27.6, 23.5, 20.3, 25.2, 25.6, 21.9, 23.2, 26.5, 24.4, 23.1, 28.3, 28.1, 24.5, 25.4, 25.8, 21.6, 19.3, 25.1, 21.6, 27.6, 27.3, 19.9, 23.1, 23.2, 24.1, 28.8, 26.7, 28.1, 24.0, 24.6, 30.2, 24.9, 28.5, 21.5, 29.3, 21.0, 31.6, 32.0, 28.8, 25.9, 22.4, 22.9, 22.9, 27.0, 26.6, 25.6, 24.4, 24.3, 28.9, 23.8, 24.8, 23.7, 25.0, 29.6, 28.3, 25.2, 26.0, 24.8, 26.9, 23.7, 23.1, 25.0, 25.9, 28.2, 25.5, 22.8, 33.0, 27.6, 25.9, 26.5.

It is assumed that 30 varieties are to be tested, and that the investigator desires to determine the relative efficiency of 1, 2, and 4-row plots. The analysis of variance will be used.

(1) One Row per Plot:

    Block        S(x_b)         S(x_b^2)
    I             727.2        528,819.84
    II            767.6        589,209.76
    III           826.7        683,432.89
    IV            760.8        578,816.64
    Totals      3,082.3      2,380,279.13

$S(x)$ for all plots = 3,082.30; $\bar{x}$ = 25.685833.
$S(x^2)$ for all plots = 80,176.37.
$S(x)^2/N$ = 79,171.4306.
$S(x^2) - S(x)^2/N$ = 1,004.9394.
$S(x_b^2)/30 - S(x)^2/N$ = 79,342.64 - 79,171.43 = 171.21.

    Variation due to         D. F.    Sum Squares    Mean Square
    Blocks                      3        171.21        57.0690
    Varieties and Error       116        833.73         7.1873
    Total                     119      1,004.94

(2) Two Rows per Plot:

    Block        S(x_b)         S(x_b^2)
    I           1,494.8      2,234,427.04
    II          1,587.5      2,520,156.25
    Totals      3,082.3      4,754,583.29

$S(x^2)/2 - S(x)^2/N$ = 159,661.25/2 - 79,171.43 = 659.19.

The total $S(x^2)$ is divided by 2 to place the results on a single plot basis so that the common correction factor, $S(x)^2/N$, can be used.

$S(x_b^2)/60 - S(x)^2/N$ = 79,243.05 - 79,171.43 = 71.62.

    Variation due to         D. F.    Sum Squares    Mean Square
    Blocks                      1         71.62        71.62
    Varieties and Error        58        587.58        10.13

(3) Four Rows per Plot:

$S(x^2)/4 - S(x)^2/N$ = 318,930.59/4 - 79,171.43 = 561.21.

    Variation due to         D. F.    Sum Squares    Mean Square
    Total                      29        561.21        19.3521

(4) Comparison of 1, 2, and 4 Row Plots

    No.               Rows        No.       Mean       Mean Square,       Pct.
    Determinations    per Plot    Blocks    Square     Basis One Plot     Efficiency
         30              1           4       7.1873        7.1873          100.00
         30              2           2      10.1307        5.0655           70.95
         30              4           1      19.3521        4.8380           37.14

B — Plot Replication in Experimental Work

IX. Replication

Replication is merely repetition. The investigator repeats a variety or treatment several times in a test in order to obtain a mean yield or value which is a more reliable estimate of the yield of the general population than that obtained from a single plot of a treatment. It also provides the mechanism for a valid estimate of the random errors in an experiment. Strictly speaking, five replications of a variety refer to six plots, i.e., the original plot and its repetition five times. For the sake of simplicity, the number of replications will be understood to mean the number of plots grown of each variety or treatment. In field experiments, a single replicate is usually planned to contain one plot of each treatment in a rather compact block. The repetition of the treatments is brought about by the repetition of the blocks. This distribution of plots over the experimental area is an effort to sample the field in an attempt to measure and, in some cases, remove the influence of soil heterogeneity. Replication in space and time is often necessary. For example, it may be desirable to repeat an experiment in other regions of the state in order to sample different soil and climatic conditions. In the same region, repetition of the experiment over a number of years may be necessary to sample the climatic conditions in different seasons.

X. History of Replication

Replication of experimental plots has been comparatively recent. In the old field tests large single plots were placed side by side. These were simple and effective for the demonstration of known facts so long as the differences to be observed were large. However, they are inadequate as soon as accurate measurements are needed because they do not take into account the tremendous variation in the soil from plot to plot.
Sir John Russell (1931) gives some of the early history of replication. The Broadbalk plots at the Rothamsted Experimental Station were split lengthwise into two halves in 1846-47 which, from that time onwards, were harvested separately. This was the first duplication of field experiments so far as can be determined. In 1847-48, and occasionally afterwards, one half of each plot was treated differently from the other, with the result that they ceased to be strict duplicates. Better duplication appears to have been practiced by P. Nielsen, founder of the Danish Experiment Station about 1870, in his experiments on grass mixtures for pastures. Some Norfolk (England) experiments carried out in the later 1880's were systematically replicated as follows: ABCDDCBA. In America, some experimenters began to use replication about 1888. Some old experiments in Kansas were replicated six times. However, replication soon fell into disuse because of the demand for information and because of limited land and funds. Single plots were the rule. Nothing further was done in England until 1909 when A. D. Hall (1909) and later Wood (1911) urged the need for the estimation of experimental errors. Marked changes came about as a result.

S. C. Salmon (1913) revived duplication of plots in this country in 1910. Single 1/10-acre plots were commonly used for variety and rate and date of seeding tests with small grains at that time. He split those 1/10-acre plots into 1/50-acre plots and replicated them five times for variety tests. His rate and date tests were replicated three times. Thus, the same area was required for variety tests as before and a smaller area for rate and date tests. Largely as the result of his efforts, the Office of Cereal Investigations, U. S. D. A., provided for replication in their work about 1912. In England at about the same time, Dr. E. S. Beaven designed his well-known strip method of replication which is especially suited to variety trials.
A questionnaire sent out by the Committee on Standardization of Field Experiments of the American Society of Agronomy in 1918 indicated that less than 20 per cent of the agronomic workers depended upon single plot tests even though they had been the rule 10 years previously. At present, replication is considered essential in modern field experiments.

XI. Reduction of Error by Replication

The most effective method to obtain greater accuracy in field experiments, as well as in many other types of agronomic experiments, is to increase the number of replications. It can be brought about to a limited extent by an increase in plot size as shown by Summerby (1925). However, frequent replication of small plots proved to be a more efficient means to obtain a high degree of accuracy than the use of the same amount of land with less frequently replicated larger plots. Love (1936) gives some uniformity trial data with cotton that indicate the same trend. The probable error for a 2-row plot 20 feet long was 10.35 per cent, while that for two single 20-foot plots was 9.01. Further, the probable error for a single 4-row plot was 9.51 while for the same area made up of four scattered units it was 7.55 per cent. Many other investigators have obtained similar results.

Since the standard error of the mean ($\sigma_{\bar{x}}$) is given by $\sigma_{\bar{x}} = s/\sqrt{n}$, it follows that $\sigma_{\bar{x}}$ decreases in proportion to the square root of the number of replications. This rule applies when the variation due to the replicates themselves is removed from the error, but not strictly otherwise. This can be illustrated with some data on 120 rod rows of bluestem wheat, cited by Hayes and Garber (1927). The value of replication was studied on the variability of yields calculated separately on the basis of 20 determinations and for 1, 2, 4, and 6 systematically distributed plots. The coefficients of variability were compared with mathematical expectation as follows:

    No.               No. systematically                Mathematical
    determinations    distributed plots      C.V.       expectation
         20                  1               9.05       9.05
         20                  2               8.34       9.05/√2 = 6.42
         20                  4               5.61       4.53
         20                  6               4.44       3.69

The calculated coefficient of variability decreases as a result of replication, but less rapidly than would be indicated by mathematical expectation. This is attributed to the greater land area used for several replications than can be used for single-plot trials which, on the average, brings in soils of greater difference in productivity than can be found in smaller areas. In this case, the error due to blocks has not been removed. That replication beyond a certain point may be impractical is indicated in some data compiled by Salmon (1923). He shows the relation between the number of replications and the probable error of the mean (expressed in per cent) as follows:

    Number of Plots     1      2      3      4      5      6      7      8      9
    Kherson oats       3.7    2.0    2.0    1.7    1.0    1.8    1.7    1.6    1.5
    Alfalfa           11.2    7.3    7.1    5.0    k.Q     M     5.5    6.0    5.8
    Ear corn           9.0    5.9    5.5    5.1    4.1    3.7    4.1    4.1    5.3

It is to be noted that variability was rapidly reduced up to 4 replications, but the decrease was at a much slower rate beyond that point. Hayes and Garber (1927) questioned whether the gain in accuracy beyond three replications warranted the additional work. The relation of replication to design will be considered in a later chapter.

XII. Number of Replications

The question naturally arises as to the number of replications that should be used. Goulden (1929) states that it depends upon the degree of soil heterogeneity, the degree of precision required, and the amount of seed available. Any desired degree of precision within practical limits may ordinarily be achieved for any given set of conditions by replication. For field plots, the American Society of Agronomy (1933) recommends 3 to 6 replications, dependent upon the degree of precision required. The smaller number will suffice when average rather than annual results are stressed.
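The square-root rule discussed under Reduction of Error by Replication can be sketched numerically. The example below is a hedged illustration only: it takes the coefficient of variability of 9.05 per cent for single plots quoted above, and it assumes that the plots vary independently, which the text notes is not strictly true unless block differences are removed. The function names are hypothetical.

```python
import math

# Coefficient of variability (per cent) for single plots, as quoted in the text
# from the Hayes and Garber (1927) bluestem wheat data.
SINGLE_PLOT_CV = 9.05


def expected_cv(single_plot_cv: float, n_plots: int) -> float:
    """C.V. of a mean of n_plots plots under the rule sigma_mean = s / sqrt(n)."""
    return single_plot_cv / math.sqrt(n_plots)


def replications_needed(single_plot_cv: float, target_cv: float) -> int:
    """Smallest number of plots whose mean reaches target_cv under the same rule."""
    return math.ceil((single_plot_cv / target_cv) ** 2)


for n in (1, 2, 4, 6):
    print(f"{n} plots: expected C.V. = {expected_cv(SINGLE_PLOT_CV, n):.2f} per cent")

print("plots needed for a C.V. of 5 per cent:", replications_needed(SINGLE_PLOT_CV, 5.0))
```

Because precision improves only with the square root of the number of plots, each further replication contributes less than the one before it, which is one way to read the diminishing returns beyond a few replications described above.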
From 4 to 6 replications are commonly used in corn variety trials. Nursery experiments ordinarily should be replicated 5 to 10 times to assure significant results. It is impossible to prescribe a rule for all cases. In rod-row trials with oats, Love and Craig (1938) found 8 or 10 replications more satisfactory than a smaller number, as 3 or 5. In alfalfa nursery plots, Tysdal and Kiesselbach (1939) concluded that 4 to 16 replications were necessary to make a 5 per cent difference statistically significant for plots that varied in size from 1/80-acre to a single space-planted 16-foot row. The larger plot required the 4 replications. However, little is to be gained by the use of more than 10 replications in field trials.

References

1. Alwood, W. B., and Price, R. H. Suggestions Regarding Size of Plots. Va. Agr. Exp. Bul. 6. 1890.
2. Barber, C. W. Note on the Influence of Shape and Size of Plots in Tests of Varieties of Grain. Me. Agr. Exp. Sta. Bul. 226, pp. 76-84. 1914.
3. Batchelor, L. D., and Reed, H. S. Relation of the Variability of Yields of Fruit Trees to the Accuracy of Field Trials. Jour. Agr. Res., 12:245-283. 1918.
4. Bryan, A. A. A Statistical Study of the Relation of Size and Shape of Plot and Number of Replications to Precision in Yield Comparisons with Corn. Ia. Agr. Exp. Sta. Rpt. for 1930-31, p. 67. 1931.
5. Bryan, A. A. Factors Affecting Experimental Error in Field Plot Tests with Corn. Ia. Agr. Exp. Sta. Bul. 163. 1933.
6. Cochran, W. G. Catalogue of Uniformity Trial Data. Suppl. Jour. Roy. Stat. Soc., 4:233-253. 1937.
7. Day, J. W. The Relation of Size, Shape, and Number of Replications of Plots to Probable Error in Field Experiments. Jour. Am. Soc. Agron., 12:100-106. 1920.
8. Engledow, F. L., and Yule, G. U. The Principles and Practices of Yield Trials. Emp. Cotton Growing Corp. 1926.
9. Goulden, C. H. Statistical Methods in Agronomic Research. Can. Seed Growers Assn. 1929.
10. Hall, A. D. The Experimental Error in Field Trials. Jour. Bd. Agr.
(London), 16:365-370. 1909.
11. Hayes, H. K. Controlling Experimental Error in Nursery Trials. Jour. Am. Soc. Agron., 15:177-192. 1923.
12. Hayes, H. K., and Garber, R. J. Breeding Crop Plants. McGraw-Hill. pp. 69-77. 1927.
13. Hayes, H. K., et al. An Experimental Study of the Rod-Row Method with Spring Wheat. Jour. Am. Soc. Agron., 24:950-960. 1932.
14. Immer, F. R. Size and Shape of Plot in Relation to Field Experiments with Sugar Beets. Jour. Agr. Res., 44:649-668. 1932.
15. Immer, F. R., and Raleigh, S. M. Further Studies of Size and Shape of Plot in Relation to Field Experiments with Sugar Beets. Jour. Agr. Res., 47:591-598. 1933.
16. Justesen, S. H. Influence of Size and Shape of Plots on the Precision of Field Experiments with Potatoes. Jour. Agr. Sci., 22:366-372. 1932.
17. Kiesselbach, T. A. The Mechanical Procedure of Field Experimentation. Jour. Am. Soc. Agron., 20:433-442. 1928.
18. Kiesselbach, T. A. Studies Concerning the Elimination of Experimental Error in Comparative Crop Tests. Nebr. Agr. Exp. Sta. Res. Bul. 13. 1918.
19. Klages, K. H. W. The Reliability of Nursery Tests as Shown by Correlated Yields from Nursery Rows and Field Plots. Jour. Am. Soc. Agron., 25:464-472. 1933.
20. Li, H. W., Meng, C. J., and Liu, T. N. Field Results in a Millet Breeding Experiment. Jour. Am. Soc. Agron., 28:1-15. 1936.
21. Ligon, L. L. Size of Plot and Number of Replications in Field Experiments with Cotton. Jour. Am. Soc. Agron., 22:689-699. 1930.
22. Love, H. H. Statistical Methods Applied to Agricultural Research. Com. Press Ltd. (Shanghai), pp. 404-420. 1936.
23. Love, H. H., and Craig, W. T. Investigations in Plot Technic with Small Grains. Cornell U. Memoir 214. 1938.
24. Lyon, T. L. A Comparison of the Error in Yield of Wheat from Plots and from Single Rows in Multiple Series. Proc. Am. Soc. Agron., 2:33-39. 1910.
25. McClelland, C. K. Some Determinations of Plot Variability. Jour. Am. Soc. Agron., 18:819-823. 1926.
26. Mercer, W. B., and Hall, A. D.
The Experimental Error in Field Trials. Jour. Agr. Sci., 4:107-132. 1911.
27. Odland, T. E., and Garber, R. J. Size of Plat and Number of Replications in Field Experiments with Soybeans. Jour. Am. Soc. Agron., 20:93-108. 1928.
28. Olmsted, L. B. Some Applications of the Method of Least Squares to Agricultural Experiments. Jour. Am. Soc. Agron., 6:190-204. 1914.
29. Report of the Committee on Standardization of Field Experiments. Jour. Am. Soc. Agron., 10:345-354. 1918.
30. Report of the Committee on Standardization of Field Experiments. Jour. Am. Soc. Agron., 25:803-828. 1933.
31. Russell, E. J. The Technique of Field Experiments (Foreword). Rothamsted Conf., 13, pp. 5-8. 1931.
32. Salmon, S. C. A Practical Method of Reducing Experimental Error in Varietal Tests. Jour. Am. Soc. Agron., 5:182-184. 1913.
33. Salmon, S. C. Some Limitations in the Application of the Method of Least Squares to Field Experiments. Jour. Am. Soc. Agron., 15:225-229. 1923.
34. Siao, Fu. Uniformity Trials with Cotton. Jour. Am. Soc. Agron., 27:974-979. 1935.
35. Smith, H. Fairfield. Comparison of Agricultural and Nursery Plots in Variety Experiments. Jour. Counc. Sci. and Ind. Res., 9:207-210. 1936.
36. Smith, H. Fairfield. An Empirical Law Describing Heterogeneity in the Yields of Agricultural Crops. Jour. Agr. Sci., 28:1-23. 1938.
37. Smith, H. Fairfield, and Myers, C. H. A Biometrical Analysis of Yield Trials with Timothy Varieties using Rod Rows. Jour. Am. Soc. Agron., 26:117-128. 1934.
38. Stadler, L. J. Experiments in Field Plot Technic for the Preliminary Determination of Comparative Yields in Small Grains. Mo. Agr. Exp. Sta. Res. Bul. 49. 1921.
39. Stephens, J. C., and Vinall, H. N. Experimental Methods and the Probable Error in Field Experiments with Sorghum. Jour. Agr. Res., 37:629-646. 1929.
40. Summerby, R. Replication in Relation to Accuracy in Comparative Crop Tests. Jour. Am. Soc. Agron., 15:192-199. 1923.
41. Summerby, R.
A Study of Size of Plats, Number of Replications, and the Frequency and Methods of Using Check Plats, in Relation to Accuracy in Field Experiments. Jour. Am. Soc. Agron., 17:140-149. 1925.
42. Swanson, A. F. Variability of Grain Sorghum Yields as Influenced by Size, Shape, and Number of Plots. Jour. Am. Soc. Agron., 22:833-838. 1930.
43. Taylor, F. W. The Size of Experiment Plots for Field Crops. Proc. Am. Soc. Agron., 1:56-58. 1908.
44. Tysdal, H. M., and Kiesselbach, T. A. Alfalfa Nursery Technic. Jour. Am. Soc. Agron., 31:83-93. 1939.
45. Westover, K. C. The Influence of Plat Size and Replication on Experimental Error in Field Trials with Potatoes. W. Va. Agr. Exp. Sta. Bul. 189. 1924.
46. Wiebe, G. A. Variation and Correlation in Grain Yield Among 1500 Wheat Nursery Plots. Jour. Agr. Res., 50:331-357. 1935.
47. Wood, T. B. The Interpretation of Experimental Results. Jour. Bd. Agr. (London) Supplement, pp. 15-37. 1911.

Questions for Discussion

1. What was the early history of the use of field plots?
2. What type of experiment is used to compare different sizes and shapes of plots? Why?
3. What practical considerations usually determine the size of plots used in field experiments?
4. What is the general objective of nursery tests and what general relation should they have to field plots?
5. Distinguish between nursery and field plots.
6. What are the common sizes of nursery plots? Field plots?
7. Compare nursery plots and field plots as to accuracy.
8. What is the relation between size of plots and the standard error? Between size of plot and border effect?
9. How may increased size of plot increase the amount of variability?
10. What size of plot has shown the lowest variability for practical purposes with wheat? Corn? Soybeans? Millet? Cotton?
11. What is meant by efficiency in plot size?
12. What reasons can be given for the variability in results with small plots?
13. What is a common size of plots for small grain nurseries? Corn trials? Sorghums?
Alfalfa? Sugar beets?
14. In general, what relation is found between shape of plots and the standard error?
15. What relation, if any, is found between the direction in which plots extend and the standard error? Why?
16. What recommendations would you make on width of field plots for the use of farm machinery? Why?
17. What relation is found between shape of plots and border effect?
18. What modifications can be made in length of nursery rows for wheat, oats, and barley, for rapid conversion of yields to bushels per acre?
19. What is replication? Why used?
20. What has been the general practice regarding replication of plots? What is the practice now?
21. Trace the early history of plot replication.
22. What serious results may result from single (unreplicated) plot trials? Why?
23. What is the theoretical relation between the number of plots and the standard error? Actual relation? Why do they not always agree?
24. What class of errors does plot replication tend to reduce or eliminate? On what class does it have no effect?
25. Discuss the statement: "Precision can be increased indefinitely by replication."
26. How does replication furnish an estimate of error?
27. Give a general rule or rules for plot replication.

Problems

1. It is desired to use 1/30-acre plots in a crop rotation experiment and to make them 14 feet wide. Calculate the plot length.
2. Some data reported by Wiebe (1935) are given below for 15-foot rows of wheat, one foot apart. The yields are reported in grams. Assume 15 varieties, and compute the efficiency for 1, 2, and 4-row plots.

Series 1   Series 2   Series 3   Series 4

715 770 760 663 753 745 641 785 360 685 755 640 725 715 700 595 380 580 710 655 675 715 690 690 613 685 353 730 670 380 670 585 560 690 530 520 493 455 [?]70 540 450 300 730 610 500 810 665 570 635 585 463 [?]53 530 [?]55 773 615 545 705 355 [?]

3.
Calculate the number of replications required to make a 5 per cent difference in yield statistically significant for these sizes of plots:

Kind of Plot                                        $\sigma_{x}$ (in per cent)
Field plot                                          3.3
Single-row plot (18-inch spacing)                   3.2
Single-row plot (24-inch spacing), hand-planted     7.0

Use the formula $\dfrac{\sigma_{x}\,(2)\sqrt{2}}{\sqrt{n}} = 5$ (per cent difference in yield), where n = the number of replications.

CHAPTER XIV

COMPETITION AND OTHER PLANT ERRORS

I. Plants in Relation to Error

That soil heterogeneity will contribute to experimental error has already been seen. There are also many errors due to plants that may contribute to the experimental error. These may be caused by differences in genetic constitution or by variations due to environmental conditions. Variations in plant stand within plots may introduce differential responses due to intra-plot competition, while the effect of one plot on the adjacent one may bring about differences due to inter-plot competition. Other errors related to plants include such effects as differences in the moisture content of the harvested crop, differences in adaptation, etc.

II. Acclimatization

A serious systematic error may be introduced through differences in acclimatization of the crops under test, unless acclimatization itself is the factor under consideration. Varieties in crops like corn, alfalfa, and red clover may vary widely in their climatic adaptation. Variety tests in corn may be a common source of error in this respect. In Nebraska, Kiesselbach (1922) compared Reid Yellow Dent corn grown 100 miles farther north with that grown and adapted at Lincoln. He obtained large differences in yield, plant height, date of maturity, length of ear, etc., within the same variety when originally grown under different conditions. Lyon (1911) reported similar results for corn and also for strains of Turkey wheat from other states included in winter-hardiness tests.
Differences in varieties may be brought about in a very few years which may introduce either a slight or a very large error. Reliable tests are impossible when varieties are collected from different climatic regions. Each variety should be grown for a year or two in the region where it is to be tested until it has undergone the changes incidental to adaptation to the new environment.

III. Plant Individuality

Plant individuality varies with different crops. It is more marked in cross-fertilized than in self-fertilized crops, e.g., it would be more important in rye than in barley. The size of plot necessary is influenced by the number of plants grown per plot, as well as by the kind of plant. For instance, it is easily possible to have 1,000,000 wheat plants on one acre, while the number of corn plants is only about 10,000 per acre. Plant individuality would be negligible in the case of small grains, but quite important in crops like corn and sorghums where the number of plants per plot may be quite low. Lyon (1911) found that quite a large error may be introduced by yield determinations from a small number of plants due to the variations in growth of certain individuals. For maize, he showed that the effect of plant individuality was practically none when each plot was composed of 100 plants.

IV. Variation in Moisture Content of Harvested Crop

In forage and cereal crops the variation in moisture content of the harvested crop may be an important source of error in yield determinations. For precise experimental results, this condition should be recognized and a remedy provided for it.

(a) Moisture in Forage Crops

Obviously, the most accurate method to determine the water or dry-matter content of the forage grown on a plot is to dry all the material to a water-free basis. Since it is impossible to do this, dry matter determinations are based on small shrinkage samples.
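A shrinkage-sample correction of this kind amounts to scaling the plot weight by the fraction of the sample that remains after drying. The following Python sketch is illustrative only: the function name, the argument names, and the weights in the example are hypothetical, and it assumes the sample is representative of the whole plot and was weighed at the same time as the plot.

```python
def corrected_yield(field_weight, sample_fresh, sample_dried):
    """Scale a plot's field weight by the dry fraction of its shrinkage sample.

    All three weights must be in the same unit. The result is on whatever
    basis the sample was dried to (air-dry or water-free).
    """
    if sample_fresh <= 0 or not 0 <= sample_dried <= sample_fresh:
        raise ValueError("need fresh > 0 and 0 <= dried <= fresh")
    return field_weight * sample_dried / sample_fresh


# Hypothetical plot: 120 lb at weighing; a 2.0-lb sample dries to 1.7 lb.
print(round(corrected_yield(120.0, 2.0, 1.7), 1))
```

Because the whole plot weight is multiplied by a ratio taken from a small sample, any error in the sample carries over in full to the corrected yield, which is why the sampling and weighing procedure matters.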
The problem of moisture determination is rather simple under semi-arid conditions where forage is readily field-cured. Forage weights are usually taken after the material is dry enough to stack, a shrinkage sample being taken at that time. The sample is weighed immediately and allowed to air-dry for 2 or 3 weeks, after which it is re-weighed for the air-dry weight. Yields corrected on this basis are found to be reliable. In case moisture-free determinations are necessary, the samples of each variety or treatment may be composited, ground, and dried in a vacuum oven for 12 to 24 hours.

Under humid conditions, reliable comparisons from the weights of field-cured forage cannot be made, except on rare occasions that cannot be predicted. As a result of the work of Farrell (1914), McKee (1914), Vinall and McKee (1916), and Arny (1916), the general practice has been to weigh the forage as soon as cut, and to sample it for air-dry or water-free determinations at that time. These green samples are usually placed in a drier at once to avoid the loss of dry matter through oxidation, fermentation, etc. The investigations of Wilkins and Hyland (1938) indicate that the samples should be taken and weighed within 4 to 6 minutes after the forage is cut to avoid error due to moisture loss. These workers also found that the error introduced through the use of green weights of alfalfa and red clover for plot yields without dry matter determinations was negligible so long as the weights were taken quickly. Yield determinations on the basis of green weights proved to be as accurate as where 2 or 3 samples per plot served as a basis for moisture determination and subsequent yield correction.

(b) Moisture in Cereals

The moisture content of small grains is usually of little consequence since the bundles are usually air-dry before threshing is attempted.
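Where grain yields are to be compared on a common moisture basis, the conversion is a simple proportion on the dry matter. The following worked equation is a sketch under stated assumptions: $W$ is the weighed grain, $M$ its moisture content in per cent (wet basis), and $M_s$ the chosen standard moisture. The symbols and the 100-pound lot at 20 per cent moisture are hypothetical; $M_s = 15.5$ is the basis commonly used for shelled corn.

```latex
W_s = W \cdot \frac{100 - M}{100 - M_s},
\qquad
W_{15.5} = 100 \cdot \frac{100 - 20}{100 - 15.5} = \frac{8000}{84.5} \approx 94.7 \text{ lb}
```

The dry matter is unchanged by the conversion; only the water assumed to accompany it is altered.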
The threshed grain may be weighed at the threshing machine and re-weighed a week later to be sure it has reached a uniform moisture content.

The determination of moisture in shelled corn is regarded as an essential practice for obtaining precise yields. Moisture determinations can be made on each plot of each variety, or a composite determination made for the variety on all replications. A common practice is to report yields on the basis of shelled corn with 15.5 per cent moisture, the maximum moisture permitted for U.S. No. 2 corn. Moisture determinations for corn or small grains can be made quickly with the Brown-Duvel moisture tester described by Coleman and Boerner (1927). Recently, the Tag-Heppenstall moisture meter, an electrical device, has been widely used for moisture determinations. This meter is calibrated for wheat, corn, oats, barley, rye, sorghums, rice, soybeans, and vetch. The electrical moisture meter has certain advantages for practical work: (1) it is unnecessary to clean after each sample; (2) the sample is not weighed; (3) a single determination can be made in less than one minute; (4) it will duplicate results within a tolerance that cannot be met in a single determination by other methods; and (5) the operation and maintenance cost is low. Cook, et al. (1934) have made a study of rapid moisture determination devices. When determinations are made on each variety in each replicate, single rather than duplicate determinations should be sufficient.

V. Competition Concept in Plants

A "struggle for existence" results when plants are grouped or occur in communities in such a way that the demands for an essential factor are in excess of the supply. This is true in many field trials. Competition always occurs when two or more plants make demands for light, nutrients, or water in excess of the supply. It is greatest between individuals of the same species which make similar demands upon the supply at the same time.
This is generally the case in cultivated crops where an area is planted to the same species or variety. A detailed discussion on the nature of plant competition is given by Clements and others (1929).

A number of investigations have been conducted to determine the importance of plant competition in experimental plots. There is apt to be an effect when varieties that differ considerably in growth habit, time of maturity, and other characters are grown in adjacent plots. The principal contention is whether or not the yield of a poorer variety growing next to a high yielding variety will be adversely affected so that the yield will be actually lower than when the variety is grown next to a plot of its own kind.

Competition may or may not influence plot yields. Two distinct schools of opinion have arisen as to its importance. In areas of limited moisture supply, competition has been generally found to be a source of error in comparative crop tests. Kiesselbach (1918) obtained errors of 24 and 46 per cent due to plant competition in two different years. Hayes and Arny (1917) found errors in small grain yield trials where varieties competed with each other. In Missouri, Stadler (1921) reported errors of 50 to 100 per cent due to plant competition. Some workers in the more humid regions, where moisture is often sufficient throughout the season for ordinary stands, consider competition effects unimportant. For example, Stringfield (1927) found only occasional disturbances in Ohio, while Garber and Odland (1925) failed to find evidences of competition in adjacent soybean rows on the West Virginia Station. Love and Craig (1938) concluded that the effect of competition is not serious enough to influence the yields of wheat and oats under New York conditions.

The influence of plant competition depends upon the test being conducted, but the possible error from this source should be kept in mind constantly.
It is a safer procedure to eliminate or provide for this source of error than to be led to erroneous conclusions by overlooking it.

A -- Intra-plot Competition

VI. Uneven Plant Distribution in Plots

Plants within a plot are in competition with each other when some factor such as moisture is present in insufficient quantities. Uneven plant distribution, with a normal number of plants per plot, was studied in corn by Kiesselbach and Weihing (1933) to determine whether or not this condition would alter acre yields. Corn was planted in hills 3.5 feet apart so as to average three plants per hill. The three systematically uneven distributions were planted so as to have 2-4, 1-3-5, and 1-2-3-4-5 plants in alternate hills. Essentially uniform stands of three plants per hill were grown for comparison. During a 14-year period the systematically variable stands of 2-4, 1-3-5, and 1-2-3-4-5 plants per hill averaged 50.6, 49.3, and 50.0 bushels per acre, respectively. The three variable stands averaged 50.0 bushels while the uniform 3-plant stand averaged 49.9 bushels per acre. In another trial, these investigators tested a random variable stand by planting corn that germinated 100, 75, 60, and 50 per cent at adjusted rates to average three viable kernels per hill. The plot yields for a single season were 24.96, 2[?].50, 25.34, and 25.12 bushels per acre for the respective germination percentages. From these data, it was concluded that systematically and randomly variable stands did not affect the yields so long as the same number of plants occurred on a plot. The authors caution that "experience has indicated that stand irregularities materially greater than those herein considered, such as are sometimes caused by rodents, worms, birds, and soil washing, would undoubtedly increase plant variability and lower the yield."
A similar type of study was conducted by Smith (1937) on 2 Australian wheat fields planted by a farm drill in which short lengths of drill row were harvested separately. Variability of plant density as found in a drill-sown field did not by itself cause a decrease in yield of grain as compared to even spacing of seed. The correlation of yield and plant number per foot of drill row, which is invariably observed in small grain fields, was said to be due to the effects of competition between nearby densities. He makes this statement: "The true correlation between yield and plant density per area may be positive, negative, or zero according to circumstances. The yield from variable seeding may be less than, equal to, or even slightly greater than the yield from even seeding, according to how near the even seeding may be optimum for the given conditions and how far the variable seeding may fall within or overlap the optimum range within which plant density is of little importance."

VII. Differences in Stand in Plots

The ideal condition for yield trials is a perfect stand on all plots, but this is not always attained. Some allowance must be made for the lost area where the stand is injured by outside influences, particularly with some crops. When the loss in stand is due to the treatment, i.e., it injures germination, destroys part of the plants, or in any manner is directly responsible for stand, the use of a perfect-stand basis for yield calculations eliminates the effect of the treatment. The lethal effect of the treatment may be a definite part of the results obtainable and should be given consideration. The effect of stand within plots is more of a problem in crops like corn, sorghums, certain legumes, and sugar beets where the plants are large, variable inter se, and subject to the influences of plant competition. Less difficulty is experienced in small grains because the plants tend to tiller and fully utilize the extra space.
(a) Competition between Unlike Hills in Corn

Relative yields of one, two, and three-plant corn hills uniformly surrounded by three-plant hills were studied by Kiesselbach (1918) (1923). He found the yields to be 61, 82, and 100 per cent for one, two, and three plants per hill, respectively. It is obvious that the fewer plants per hill made some use of the additional space. In another test, the relative yields of three-plant hills were compared when adjacent to hills with various numbers of plants.

3-plant hills surrounded by 3-plant hills         Average Grain Yield per Hill
except as indicated below:                        Actual (lbs.)    Relative (pct.)
Surrounded by 3-plant hills                       [?]              100
Adjacent to one hill with 2 plants                1.098            102
Adjacent to one hill with 1 plant                 [?]              10[?]
Adjacent to one blank hill                        1.224            114

It is obvious that 3-plant hills adjacent to blank or 1 or 2-plant hills tend to yield higher than when surrounded by 3-plant hills. In comparisons of inbred lines and F1 hybrids, Brewbaker and Immer (1931) found that a rather large error may be introduced in yields where hills have reduced stands or are adjacent to hills which lack in stand. However, under a 3-plant rate in corn it is generally conceded that 10 to 15 per cent of the stand may be lost before the yield is measurably reduced or the experimental accuracy affected in ordinary yield trials.

(b) Competitive vs. Non-Competitive Yields in Sugar Beets

Sugar beet tonnages are usually reported as (1) total weight of all beets on a unit area, or as (2) a calculated yield from "normally competitive" beets. The beets which serve as the basis for calculation are those grown surrounded by neighbors on all sides at appropriate distances for the conditions imposed in the experiment. A study of the response of sugar beets to increased space allotment was made by Brewbaker and Deming (1935). Their plants were grown in 20-inch rows and thinned to 12 inches between plants in the row.
Beets around a single blank space were found to increase in weight sufficient to compensate for [?].2 per cent of the loss of a single beet. They obtained increases of 28.7, 39.2, and 95.0 per cent for beets adjacent to one blank space in the same row, between two blank spaces in the same row, and with blanks on four sides, respectively. It is evident that the beet weight was greatly influenced by the relative area available for its development. The regression of weight of beets upon stand was essentially linear for stands between 25 and 75 per cent. For each 10 per cent increase in stand there was an increase of from 0.76 to 2.10 tons of beets per acre for the regression within blocks. There may be situations where yields based on competitive beets would be in error, particularly in poor stands and in spacing tests. Such instances have been pointed out by Nuckols (1936). He harvested actual and competitive beets on 254 plots where the stand varied from 50 to 100 per cent. A greater difference in competitive and actual yields was obtained for poor stands than for good stands. In fact, the mathematical possibilities showed that there are only 35 per cent of competitive beets in a 90 per cent stand, 10 per cent in an 80 per cent stand, and 5 per cent in a 70 per cent stand. This indicates the greater possibility for error when competitive beets are taken from poor stands. Nuckols also found an indication that there is a greater difference between competitive and actual yields where the beets are closely spaced than where more widely spaced in the row. It is obvious that in rate of spacing tests, the method of selection of competitive beets is not the same for all plants.

(c) Stand Effects in Other Crops

In potatoes, Livermore (1927) reports that the yield of the two hills adjacent to a blank may be 40 per cent more than that for hills surrounded by hills.
Werner and Kiesselbach (1929) found 62.5 per cent of the loss in yield was recovered in potatoes adjacent to one-hill blanks. In alfalfa nursery plots, Tysdal and Kiesselbach (1939) found for variable seed rates that stands tended to equalize after 4 years. Considerable latitude in the amount of seed sown per row was possible without serious effects on comparative varietal performance.

VIII. Corrections for Uneven Stands

A great deal of attention has been given to possible corrections for loss of stand. It should be emphasized that there is no entirely satisfactory method to correct for uneven stands, it being better practice to prevent them so far as possible. One method to avoid poor stands is to plant thick and thin the young plants to the desired rate. For example, where a stand of 3 plants per hill is desired in corn, the experimenter may plant 6 kernels per hill and subsequently thin the seedling plants to 3 per hill. Most empirical methods for the correction of yields on a stand basis are based upon plants surrounded by the normal stand, i.e., competitive plants. Stewart (1919) (1921) gives a formula for the correction of stand errors in potatoes where the stand is relatively satisfactory. The practice in corn experiments is to harvest the entire plot without stand corrections when the stand is 90 per cent of the theoretical or better. For less than that, it is usually harvested on a perfect-stand basis. Kiesselbach (1918) (1923) selects only perfect-stand hills surrounded by hills with the same stand and computes the yields from these. Bryan (1933) found that 26 per cent fewer hills were required to obtain any given degree of precision with only perfect-stand hills than with all hills regardless of stand. Adjustment of the yields of perfect-stand hills further reduced the number of hills required for any degree of precision by 18.9 per cent. The procedure in the U. S.
Department of Agriculture for the uniform corn hybrid tests is to adjust yields for missing hills but not for minor variations in stand. Probably the most satisfactory method for the adjustment of yields on the basis of stand is by covariance in which the regression coefficients are calculated. Mahoney and Baten (1939) have outlined its use for this purpose. When there is a fairly high variation due to soil heterogeneity and no appreciable differences in stand, usually nothing is gained by adjustment.

B -- Inter-plot or Border Effect Competition

IX. Types of Inter-plot Competition

Many studies have been conducted to determine the border effects of adjacent plots. The committee on the standards for the conduct of field experiments for the American Society of Agronomy (1933) makes this statement: "In a majority of soil experiments and in many cultural and variety tests, plot yields may be modified by contiguity to other treatments, crops, or interspaces. Border competition in adjacent unlike plots often raises some yields and lowers others." A vigorous variety may benefit when grown next to a poor one, particularly in single-row plots. The same type of error may be introduced in rate and date of planting tests. As a result, multiple-row plots are often used in experimental work with the border rows discarded. This procedure is justified on the basis of experimental data which indicate that the yield order may be changed when border rows are included in the plot yields, according to Arny (1921). In some fertilizer and cultural experiments alleys between plots are necessary because the treatment may spread to the next plot through faulty application.

X. Effect in Variety Tests

Most tests to determine the amount of inter-plot competition have been on the basis of single-row vs. multiple-row plots with the borders discarded.
(a) Small Grains

It is concluded by Hayes and Arny (1917) that there is considerable competition between rod rows of small grains when grown one foot apart. This led to the adoption of three-row plots for small grain variety tests at the Minnesota station. Comparisons of three-row plot yields with the central rows showed that the central rows alone are as accurate for yield determinations as all three rows. Kiesselbach (1918) found that competition caused Big Frame wheat to yield 10.3 and 12.4 per cent too high in 1913 and 1914, respectively, when grown in alternate rows with Turkey. Burt oats yielded 10 and 38 per cent too high for these years when grown in alternate rows with Kherson. Stadler (1921) found competition in small grains to be more extreme between different varieties than between different commercial strains of the same variety. As a result, it is almost the universal practice to grow small grain nursery plots in multiple rows and to discard at least one border row from each side at harvest. The use of single-row or 3-row plots with all rows harvested appears possible under humid conditions where competition appears to be slight. (See Love and Craig, 1938.)

(b) Corn

As early as 1909, Smith (1909) found one-row plots too narrow for fair tests in corn when varieties of diverse characteristics were planted in adjacent rows. A variety with short stalks was at a disadvantage when grown next to a taller one because of shading, or a variety with "strong foraging powers" may compete more successfully for moisture and plant food than a weaker or slower growing neighbor. Kiesselbach (1922) (1923) found that where large and small varieties of corn were grown in alternate rows, the smaller variety yielded 66 per cent as much as the larger one, and only 47 per cent as much when both were planted in the same hill.
The smaller variety yielded 85 per cent as much when planted in alternate 5-row plots and the three center rows harvested for yield. That the smaller variety was being robbed of light, water, and nutrients was shown by the yields where each variety was surrounded by its own kind.

(c) Other Crops

Competition between soybean varieties was studied by Brown (1922) in Connecticut. Twenty-five single-row check plots of a small and early soybean variety averaged 26.9 bushels of seed per acre. When the check was adjacent to larger and later varieties like Mammoth Yellow, the checks averaged only 17.1 bushels, or 63.6 per cent as much as the average of all checks. In potatoes, he concluded that yields were not influenced by competition between single-row plots.

In alfalfa, solid-drilled plots with a 7-inch row spacing have been shown to be definitely subject to serious inter-plot varietal competition. The work of Tysdal and Kiesselbach (1939) indicates that the effects could be overcome when the border rows were discarded at harvest. When the alley space between plots was widened to 12 inches, a significant interaction between varieties was also prevented. The relative yields from single or multiple-row plots with either 18 or 24-inch row spacing likewise exhibited no significant differential interaction.

Immer (1934) made a study of the effect of competition between adjacent rows of different varieties of sugar beets, i.e., "Old Type" and "Extreme Pioneer". These were grown in alternate single-row plots and also in 4-row plots with the border rows removed for yield determinations. When grown in single-row plots the "Old Type" brand yielded 3.78 ± 0.44 tons more per acre than "Extreme Pioneer." In 4-row plots, with the central two rows alone being harvested, the increase of "Old Type" over "Extreme Pioneer" was only 1.78 ± 0.31 tons per acre. The difference between these two differences was 2.00 ± 0.54 tons, a value that is significant.
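The error attached to the 2.00-ton figure can be checked with a short calculation. The sketch below assumes that the ± values quoted from Immer are standard errors of independent estimates (the text does not say so explicitly) and combines them as the square root of the sum of their squares. The function name is illustrative only.

```python
import math


def difference_of_differences(d1, se1, d2, se2):
    """Return (difference, its standard error, ratio of the two).

    Assumes d1 and d2 are independent estimates, so that their
    errors combine as the root of the sum of squares.
    """
    diff = d1 - d2
    se = math.sqrt(se1 ** 2 + se2 ** 2)
    return diff, se, diff / se


# Advantage of "Old Type" over "Extreme Pioneer" in tons per acre:
# single-row plots versus 4-row plots with the two center rows harvested.
diff, se, ratio = difference_of_differences(3.78, 0.44, 1.78, 0.31)
print(f"{diff:.2f} +/- {se:.2f} tons; ratio = {ratio:.1f}")
```

Under these assumptions the calculation gives 2.00 ± 0.54 tons, and the difference is more than three times its error.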
Thus, "Old Type," the higher yielding sort, profited at the expense of "Extreme Pioneer" when these two brands were grown side by side in single-row plots.

In cotton variety tests, Christidis (1937) found that competition may cause a definite bias in the estimation of comparative yields of cotton varieties. Hancock (1936) tested two cotton varieties with diverse growth characteristics. The varieties were Acala, a late tall variety, and Delfos, an early semi-dwarf type. The varieties were arranged in these combinations with the series alternated: DDDDAD and AAAADA. He observed that Delfos rows with Acala on only one side (DDA) differed very little from Delfos rows bordered by their own kind (DDD). For instance, as an average for four years, DDD produced only 1.4 per cent more seed than DDA, while AAA produced ^.01 per cent less than AAD. Where two rows of the same variety are planted together, each row is bordered by a different variety on only one side. Since he found this effect to be small, two-row plots were advocated with both rows harvested for yield. Such a procedure may be satisfactory under conditions of abundant moisture, but would be questionable where habitat factors are severely limited.

XI. Rate and Date Tests

Under most environmental conditions competition will exist between plots in rate and date of planting tests. Hulbert (1931) presents data to show that border effect on outside rows increases as the rate of seeding is increased. The border effect on Red Bobs wheat was 147.85 per cent when seeded at the rate of three pecks per acre, 175.^1 per cent for five pecks, and 173.01 for seven pecks. Kiesselbach (1918) tested two rates of planting for Turkey wheat, a thin and a thick rate. The thin rate yielded 68 per cent as much as the thick rate when grown in alternate single-row plots, and 90 per cent as much when grown in alternate five-row plots.
Competition between alternate single-row plots for two rates for Kherson oats caused the thin rate to yield 20 per cent too low in 1913 and 3^.3 per cent too low in 1914. Nebraska White Prize corn was planted in alternate rows so as to obtain two and four plants per hill. Due to competition the thin rate yielded relatively 29. and 9.0 per cent too low in different years. Similar results would be expected in date of planting tests. Klages (1928) found a marked degree of competition in spacing tests with sorghums. Yields of rows with dense stands profited at the expense of the yields of adjacent rows with thinner stands. The degree of competition was influenced by environmental conditions.

XII. Border Effect

Plants that grow along the sides and ends of plots are often more thrifty and vigorous than those in the interior. This is particularly true when the plots are surrounded by alleys. Border effect is considered here to mean the effect of blank alleys on the border rows. The amount and extent of this border effect is important in comparative crop tests.

(a) Small Grains

Arny and Hayes (1918) and Arny (1921) (1922) studied (1) the distance alley effect is operative within plots, (2) the increase in yield due to alley effect, and (3) the influence of additional alley space on variety response. They used small grain plots composed of 16 drill rows six inches apart. The yields of the border rows were compared with those of the center rows. Arny (1921) gives some typical data:

                            Oats             Wheat            Barley
Description              Bu.    Pct.      Bu.    Pct.      Bu.    Pct.
Outside border rows     65.58  199.9     30.56  153.6     48.93  213.5
Middle border rows      58.53  178.4     25.75  127.1     42.74  136.5
Inside border rows      49.95  152.3     22.23  111.7     35.56  IkS.k
Central rows            32.80  100.0     19.90  100.0     22.92  100.0
As an average for three years, the yield of the outside rows of oats, spring wheat, and barley, expressed in per cent of the yield of the central rows, is 199.8, and that of the middle rows 138.0, when the plots were surrounded by 18-inch clean-cultivated alleys. Border effect was relatively unimportant when extended to the third drill row. Knowledge that border effect is not uniform precludes the use of any percentage figures derived in one place to reduce yields secured in another location to a border-effect-free basis. Arny (1921) further showed that the rank of a variety may be changed due to border effect. In all cases, plot yields were higher when the border rows were included than where these rows were eliminated before harvest.

Hulbert and Remsberg (1927) found it necessary to discard two border rows from each side of small grain plots to remove the error due to border effect in variety tests. Competition effects were noticeably increased when the adjacent plots were seeded at different rates. Hulbert, et al. (1931) obtained similar results. Robertson and Koonce (1934) studied border effect in relation to yield on Marquis wheat grown in plots irrigated at different stages, with different numbers of border rows included. The yield increased as the size of plot increased, but the percentage increase was uniform for the three different treatments employed. Comparable yields were the same for plots of 10 rows, and for 10 plus 2, 4, or 6 border rows.

(b) Other Crops

In kafir and milo, Cole and Hallsted (1926) obtained marked increases in yield from outside rows. The excess yield was roughly proportional to the increased available soil area. Recently, Conrad (1937) has called attention to the fact that sorghum plants next to uncropped areas may use soil moisture six feet away laterally. A definite use of nitrates was made four feet away laterally for both sorgo and corn.
The influence of border effect on total dry matter per plot was studied at the Central Experimental Farm (Ottawa) by McRostrie and Hamilton (1927). In all cases, border plants of Western rye grass gave an increased yield due to the influence of the two-foot pathway which surrounded the plots. The increase in yield differed with the strain under test, and varied from 6 to ^k per cent. The rank of the strains was materially changed due to the wide variation in border yields.

When theoretical plots 1/72.6-acre in size were used for red clover and alfalfa forage yields, Hollowell and Heusinkveld (1933) found a serious experimental error in yield when border rows were included in the harvested plot. Their plots were composed of 8, 12, and lo-inch alleys. The inclusion of border rows increased the yield from 2.1 per cent to 20.0 per cent for red clover and from 1.8 to lt.O per cent for alfalfa. Border effect was greater on the first than on the second alfalfa crop, but varied greatly from year to year under Ohio conditions. Rainfall appeared to be directly correlated with border effect. These investigators concluded that the discard of two border rows would effectively eliminate border competition on plots of this size.

Similar results were obtained by Tysdal and Kiesselbach (1939) when they compared dissimilar adjacent alfalfa plots that differed as to spacing of rows or plants. A solid-drilled block with 7-inch row spacing was separated by a 7-inch alley space from a space-planted block with rows 24 inches apart. The adjacent border rows were compared with their respective types of interior rows. The solid-drilled row gave an excess yield of 74 per cent because of reduced competition on one side, whereas the space-planted row was depressed 63 per cent in yield because of increased competition. It is evident that great care must be exercised in taking yields from adjacent rows that differ with respect to row spacing or density of stand.

XIII.
Control of Inter-plot Competition

Inter-plot competition can be controlled by several methods. Hayes and Garber (1927), Kiesselbach (1918) (1923), and others give these recommendations: (1) group varieties with similar growth habits, dates of maturity, etc., together; (2) use multiple-row plots; and (3) discard outside border rows and ends at time of harvest. Alleys are sometimes used in closely-sown crops such as small grains and forage crops to facilitate harvest and to reduce mixtures. In small plots the borders should be removed, but in large field plots it is generally satisfactory to harvest the entire plot and to include the additional alley space in the plot area. Untreated interspaces of sufficient width to avoid serious soil translocation are recommended for permanent soil fertility, rotation, and tillage experiments. These alleys can either be cropped or left bare.

References

1. Arny, A. C. The Dry Matter Content of Field Cured and Green Forage. Jour. Amer. Soc. Agron., 8:358-363. 1916.
2. Arny, A. C. Border Effect and Ways of Avoiding It. Jour. Amer. Soc. Agron., 14:266-278. 1922.
3. Arny, A. C., and Hayes, H. K. Experiments in Field Technic in Plot Tests. Jour. Agr. Res., 15:251-262. 1918.
4. Arny, A. C. Further Experiments in Field Technic in Plot Tests. Jour. Agr. Res., 21:483-499. 1921.
5. Brewbaker, H. E., and Immer, F. R. Variations in Stand as Sources of Experimental Error in Field Tests with Corn. Jour. Amer. Soc. Agron., 23:k6^-h&l. 1931.
6. Brewbaker, H. E., and Deming, G. W. Effect of Variations in Stand on Yield and Quality of Sugar Beets Grown under Irrigation. Jour. Agr. Res., 50:195-210. 1935.
7. Brown, B. A. Plot Competition with Potatoes. Jour. Amer. Soc. Agron., 14:257-258. 1922.
8. Bryan, A. A. Factors Affecting Experimental Error in Field Plot Tests with Corn. Ia. Agr. Exp. Sta. Bul. 163. 1933.
9. Christidis, B. G. Competition Between Cotton Varieties: A Reply. Jour. Amer. Soc. Agron., 29:703-705. 1937.
10. Clements, F. E., Weaver, J. E., and Hanson, H. C.
Plant Competition: An Analysis of Community Functions. Carnegie Institution of Washington. 1929.
11. Cole, J. S., and Hallsted, A. L. The Effect of Outside Rows on the Yields of Plots of Kafir and Milo at Hays, Kansas. Jour. Agr. Res., 32:991-1002. 1926.
12. Coleman, D. A., and Boerner, E. G. The Brown-Duvel Moisture Tester and How to Operate It. Dept. Bul. 1375, U. S. D. A. 192?.
13. Conrad, J. P. Distribution of Residual Soil Moisture and Nitrates in Relation to Border Effect of Corn and Sorgo. Jour. Amer. Soc. Agron., 29:367-378. 1937.
14. Cook, W. H., Hopkins, J. W., and Geddes, W. F. Rapid Determination of Moisture in Grain. Can. Jour. Res., 11:26^-239, and kOO-kk'j. 1934.
15. Farrell, F. D. Basing Alfalfa Yields on Green Weights. Jour. Amer. Soc. Agron., 6:h2-ko. 1914.
16. Garber, R. J., and Odland, T. E. Influence of Adjacent Rows of Soybeans on One Another. Jour. Amer. Soc. Agron., 18:605-607. 1925.
17. Grantham, A. E. The Effect of Rate of Seeding on Competition in Wheat Varieties. Jour. Amer. Soc. Agron., 6:12'i--128. 1914.
18. Hancock, E. I. Row Competition and its Relation to Cotton Varieties of Unlike Plant Growth. Jour. Amer. Soc. Agron., 28:948-957. 1936.
19. Hayes, H. K., and Arny, A. C. Experiments in Field Technic in Rod Row Tests. Jour. Agr. Res., 11:399-419. 1917.
20. Hayes, H. K., and Garber, R. J. Breeding Crop Plants, pp. 75-79. 1927.
21. Hollowell, E. A., and Heusinkveld, D. Border Effect Studies of Red Clover and Alfalfa. Jour. Amer. Soc. Agron., 25:779-789. 1933.
22. Hulbert, H. W., and Remsberg, J. D. Influence of Border Rows in Variety Tests of Small Grains. Jour. Amer. Soc. Agron., 19:585-590. 1927.
23. Hulbert, H. W., et al. Border Effect in Variety Tests of Small Grains. Idaho Agr. Exp. Sta. Tech. Bul. 9. 1931.
24. Immer, F. R. Varietal Competition as a Factor in Yield Trials with Sugar Beets. Jour. Amer. Soc. Agron., 26:259-261. 1934.
25. Kiesselbach, T. A. Competition as a Source of Error in Comparative Corn Tests. Jour. Amer.
Soc. Agron., 15:199-215. 1923.
26. Kiesselbach, T. A. Corn Investigations. Nebr. Agr. Exp. Sta. Res. Bul. 20. 1922.
27. Kiesselbach, T. A., and Weihing, R. M. Effect of Stand Irregularities upon the Acre Yield and Plant Variability of Corn. Jour. Agr. Res., 47:399-416. 1933.
28. Kiesselbach, T. A. Studies Concerning the Elimination of Experimental Error in Comparative Crop Tests. Nebr. Agr. Exp. Sta. Res. Bul. 13. 1918.
29. Klages, K. H. Yields of Adjacent Rows of Sorghums in Variety and Spacing Tests. Jour. Amer. Soc. Agron., 20:582-599. 1928.
30. Livermore, J. R. A Critical Study of Some of the Factors Concerned in Measuring the Effect of Selection in the Potato. Jour. Amer. Soc. Agron., 19:857-896. 1927.
31. Love, H. H., and Craig, V. T. Investigations in Plot Technic with Small Grains. Cornell U. Memoir 214. 1938.
32. Mahoney, C. E., and Baten, W. D. The Use of the Analysis of Covariance and its Limitation in the Adjustment of Yields based upon Stand Irregularities. Jour. Agr. Res., 58:317-320. 1939.
33. McKee, R. Moisture as a Factor of Error in Determining Forage Yields. Jour. Amer. Soc. Agron., 6:113-117. 1914.
34. McRostrie, G. P., and Hamilton, R. I. The Accurate Determination of Dry Matter in Forage Crops. Jour. Amer. Soc. Agron., 19:2^3-2fjl. 1927.
35. Nuckols, S. B. The Use of Competitive Yield Data from Sugar Beet Experiments. Jour. Amer. Soc. Agron., 28:92U-93^. 1936.
36. Robertson, D. W., and Koonce, D. Border Effect in Irrigated Plots of Marquis Wheat Receiving Water at Different Times. Jour. Agr. Res., 48:157-166. 1934.
37. Smith, H. Fairfield. The Variability of Plant Density in Fields of Wheat and its Effect on Yield. Counc. Sci. and Ind. Res. Bul. 109 (Australia). 1937.
38. Stadler, L. J. Experiments in Field Plot Technic for the Preliminary Determination of Comparative Yields in the Small Grains. Mo. Agr. Exp. Sta. Res. Bul. 49. 1921.
39. Standards for the Conduct and Interpretation of Field and Lysimeter Experiments. Jour.
Amer. Soc. Agron., 25:803-828. 1933.
40. Stewart, F. C. Missing Hills in Potato Fields: Their Effect upon Yields. New York State Agr. Exp. Sta. Bul. 459, pp. 45-69. 1919.
41. Stewart, F. C. Further Studies on the Effect of Missing Hills in Potato Fields on the Variation in the Yields of Potato Plants from Halves of the same Seed Tuber. New York (Geneva) Agr. Exp. Sta. Bul. 489. 1921.
42. Stringfield, G. H. Intervarietal Competition among Small Grains. Jour. Amer. Soc. Agron., 19:971-983. 1927.
43. Tysdal, H. M., and Kiesselbach, T. A. Alfalfa Nursery Technic. Jour. Amer. Soc. Agron., 31:83-98. 1939.
44. Vinall, H. N., and McKee, Roland. Moisture Content and Shrinkage of Forage and the Relation of these Factors to the Accuracy of Experimental Data. Dept. Bul. 353, U. S. D. A. 1916.
45. Werner, H. O., and Kiesselbach, T. A. The Effects of Vacant Hills and Competition upon the Yield of Potatoes in the Field. An. Proc. Potato Assn. America, 16:109-120. 1929.
46. Wiebe, G. A. The Error in Grain Yield Attending Misspaced Wheat Nursery Rows and the Extent of the Misspacing Effect. Jour. Amer. Soc. Agron., 29:713-716. 1937.
47. Wilkins, F. S., and Hyland, H. L. The Significance of Dry Matter Determinations in Yield Tests of Alfalfa and Red Clover. Ia. Agr. Exp. Sta. Res. Bul. 240. 1938.

Questions for Discussion

1. Why should pure seed be used in variety tests?
2. How may differences in acclimatization introduce errors in crop tests? How can they be avoided?
3. When or with what crops or under what conditions is plant individuality a factor to be considered in planning experiments?
4. When is moisture content of the crop a factor of importance? How may the error be eliminated or corrected?
5. Could you secure comparable forage yields by taking green weights? Why?
6. What are the advantages of the vacuum oven over an ordinary oven for securing moisture-free weights?
7. Compare rapid moisture determining devices for cereals.
8. What is meant by plant competition?
Who have emphasized its importance?
9. Is competition universally present in experimental plots? Is it always objectionable? Explain.
10. What effect does severe competition have on plants?
11. How can you reconcile the fact that some workers claim plant competition is a fruitful source of error in experimental work, while others contend it is negligible?
12. Do stand irregularities in corn affect the yield so long as the same number of plants per unit area is involved? Explain.
13. Why may a variable stand in a wheat field yield as much as an evenly-spaced stand? Explain.
14. Why is intra-plot competition in small grains unimportant from the practical standpoint?
15. What is the general effect in corn hills surrounded by hills with different numbers of plants? Why?
16. What is meant by "normally competitive" in calculation of sugar beet yields?
17. What is the effect of adjacent blank hills on the weights of individual beets?
18. Under what conditions may yields from "competitive" beets result in errors in yield?
19. What recommendations would you make as to correcting for uneven stands?
20. What is the general practice for the prevention of errors due to uneven stands in corn and sorghums?
21. How are stand errors generally corrected in corn plots?
22. What is meant by border competition?
23. How may errors be introduced in variety tests by use of single-row plots?
24. How could you possibly justify two-row plots in cotton variety tests with no borders removed? Single-row alfalfa plots?
25. How does competition introduce errors in rate and date tests?
26. Under what conditions may it be desirable to have blank alleys surrounding plots?
27. What influences do border rows have on plot yields?
28. Is it always necessary to remove borders for the determination of plot yields? Why?
29. What recommendations would you make for the control of inter-plot competition?

Problems

1.
Explain how to arrange and conduct an experiment with 10 varieties of corn so as to control both intra-plot and inter-plot competition.
2. The yield of field-cured hay on a l/lu-acre plot is *K30 lbs. The shrinkage sample taken at that time weighed 3.8 lbs. After 3 weeks it weighed 3.4 lbs. Calculate the yield per acre of the plot on an air-dry basis.
3. The yields (marketable ears) and stands of 6 strains of sweet corn for 4 replications were as follows (data from Mahoney and Baten):

                     Yield and Stand for Strain Number:
Item                 1      2      3      4      5      6
Replication 1
  Yield (x)         56     31     21     23     30     60
  Stand (y)         77     68     61     83     70     3)4
Replication 2
  Yield (x)         64     29     32     20     59     30
  Stand (y)         30     76     72     38     39     92
Replication 3
  Yield (x)         36     30     24     18     60     >+7
  Stand (y)         7*     83     82     78     78     78
Replication 4
  Yield (x)         36     32     24     19     39     30
  Stand (y)         57     61     73     78     32     38
Yield Totals       212    122    101     80    208    247
Stand Totals      d. ju   268    283    327    289    342

Calculate the regression of yield on stand.

CHAPTER XV

DESIGN OF SIMPLE FIELD EXPERIMENTS

I. Criticisms of Agronomic Experiments

There are about 2300 agronomic projects in force in the different state experiment stations, besides those carried on by the U. S. Department of Agriculture and those in related fields. In fact, two-thirds of all agricultural experimental projects in this country are agronomic. They have increased in number by 50 per cent since 1920.

Frequently, this experimental work is criticised by farmers and others. The criticism may or may not be justified. Agriculture is sometimes looked upon as a "practical" field in which results are sought rather than knowledge concerning the phenomena of life. At other times, there is a genuine shortcoming in experimentation. Allen (1930) states that fully one-half of the agronomic experimental projects consist of tests and trials of different kinds. Very little ingenuity is involved in many of them. Variety and cultural experiments are popular, while many genetic studies are merely field selection.
Soil fertility experiments are often shallow. In many cases, old methods of experimentation are used, while in others the experiments are carried on too long.

A -- Basic Principles in Design

II. Outline of Experimental Tests

A review of the literature on the subject should be the first step in the plans for an experiment. This should be followed by a detailed outline in order to crystallize the ideas of the investigator on the subject. Recently, Fisher (1937) has shown that the design of an experiment is inseparable from the statistical analysis of the data.

Certain objectives must be kept in mind in all agricultural experiments. These may be enumerated as follows: (1) the tests should furnish a basis for recommendations to farmers; (2) they should furnish ocular proof of the beneficial results attained; and lastly, (3) they should supply information on the fundamental causes of the phenomena which the results are expected to demonstrate.

Several factors need to be considered in the outline of an experiment. These are well described by Allen (1930): (1) It should be definite and limited in scope. (2) The problem should be submitted to competent persons for criticisms and suggestions. (3) The investigator should be familiar with previous work on the subject so that he can start work where others left off. (4) Next, he should ascertain the data essential to the problem and devise means to secure and analyze them. (5) Then it remains to test their applicability or sufficiency to the problem. Sometimes it is found that progress is dependent upon the advance in related sciences.

III. Principle of the Extremes

Results should be secured over a wide range on either side of the optimum. Stapledon (1931) has made this statement: "I believe in all field experiments of a research nature we should go at each end far beyond what is deemed by practical men to be the economic limit."
The situation may be illustrated in a rate of seeding test for wheat where the optimum rate is approximately 5 pecks per acre. The preliminary test should include rates at regular intervals from the very lowest to a maximum well beyond the point where the optimum is expected to fall, e.g.,

    1 peck          5 pecks          10 pecks
    minimum         optimum          maximum

The size of interval in tests is determined by the amount of land, facilities, character of the problem, or available finances. An increase in the size of the interval is justifiable as one goes from the optimum to either a minimum or a maximum. In the final test, it may be advisable to throw out the extremes and conduct a precise test around the optimum rate.

IV. Simple vs. Complex Experiments

Experiments may be classified into several kinds based on the number of factors studied at the same time. The formal experiment is sometimes preceded by a preliminary test.

(a) Preliminary Tests

All preliminary experiments are necessarily empirical in nature. They give the investigator an opportunity to detect faulty technique, inadequate methods, etc. The final experiment can be planned to eliminate many of the shortcomings observed in the preliminary test. A survey is sometimes used for a preliminary test. A further use of the preliminary experiment is to reduce the error in subsequent tests. (See Wishart and Sanders, 1935.)

(b) Simple Experiments

One thing is studied at a time in the simple experiment. All factors are kept constant or uniform, so far as possible, except the one under investigation. This is the classical method of experimentation, i.e., the essential conditions are varied only one at a time. R. A. Fisher has recently pointed out that this approach is inadequate for many research problems because the laws of nature may be controlled and influenced by several variables.
In his book on "The Design of Experiments", Fisher (1937) makes this statement: "We are usually ignorant which, out of innumerable possible factors, may prove ultimately to be the most important, though we may have strong presuppositions that some few of them are particularly worthy of study. We have usually no knowledge that any one factor will exert its effects independently of all others that can be varied, or that its effects are particularly simply related to variations in these factors". The simple experiment is justified when the time, material, or equipment are too limited to allow for attention to more than one narrow aspect of the problem. As an illustration of this type, an experiment can be set up to determine the best variety of sugar beets to grow. Another could be designed to determine the best fertilizers to apply, while a third separate experiment could be devoted to the best cultural practices. The simple experiment is the one most commonly used by investigators. It is recommended to beginners because it is less involved.

(c) Combination and Complex Experiments

More than one variable is studied at a time in combination experiments. Examples of some of the more simple experiments of this type are: (1) rate and date of planting tests, (2) the relation between time of planting and date of maturity, (3) depth and rate of planting in relation to yield, (4) fertilizer tests, etc.

Recently, the Rothamsted workers have advocated the complex experiment, in which two or more treatments are studied in all possible combinations. Yates (1935) states that complex experimentation is due primarily to R. A. Fisher, who first suggested it in 1926. It is extensively practiced at Rothamsted and to a lesser extent elsewhere. Fisher (1937) claims two advantages of the complex experiment (factorial arrangement) over experiments that involve single factors, viz., greater efficiency and greater comprehensiveness.
A further advantage is that a wider inductive basis for conclusions is available. As an example, a complex experiment could be set up to determine the responses to several fertilizers and methods of land preparation.

V. Replication

As previously pointed out, soil heterogeneity is the principal source of error in the field experiment. It can be overcome theoretically by replication, which tends to diminish the experimental error as well as to provide for an estimate of the magnitude of such errors. Fisher (1931) gives a diagram to show these relationships:

                        I. Replication
                       /              \
    II. Random Distribution        III. Local Control
                |                          |
    Validity of estimate           Diminution of error
         of error

(a) Relation to Soil Heterogeneity

The standard error of the mean of one variety or treatment decreases in proportion to the square root of the number of replications. Some workers have argued that increased replication results in more heterogeneity, due to the occupation of a larger land area, with the result that a point will be reached beyond which further replication will give no further increase in accuracy. Fisher (1931) points out that the experimental error is due only to the irregularities within blocks and that this difficulty is not effective when different treatments are compared locally within relatively small pieces of land. The number of blocks or replicates makes no difference because the block effect may be removed by the experimental arrangement (e.g., randomized blocks and Latin squares). Large blocks present a problem in themselves. The situation of large blocks led Hayes (1923) to make the statement that, when a large number of strains are being tested, it is necessary to use a large number of replications to attain the same degree of accuracy as when a smaller number of strains are being compared. Special designs are advisable for tests of a large number of varieties or treatments.
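The square-root relation can be illustrated with a short calculation. The sketch below is not from the text: it assumes that the standard deviation of a single plot stays the same as replicates are added (which is the very point in dispute above, and which holds only so far as block differences are removed), and the value of 4.0 bushels is hypothetical.

```python
import math


def standard_error_of_mean(plot_sd, replications):
    """Standard error of a variety mean based on `replications` plots."""
    return plot_sd / math.sqrt(replications)


def replications_needed(plot_sd, target_se):
    """Smallest whole number of replications giving at most `target_se`."""
    return math.ceil((plot_sd / target_se) ** 2)


plot_sd = 4.0  # hypothetical standard deviation of a single plot, bu. per acre
for n in (1, 2, 4, 8, 16):
    print(n, round(standard_error_of_mean(plot_sd, n), 2))
```

Four replications halve the standard error of a single plot, and sixteen are required to reduce it to one-fourth, so each further gain in precision costs more land than the last.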
(b) Duration of Tests

Replication in time is a necessary consideration in experimental tests. Comparative results from various treatments or varieties are frequently modified or even reversed in different seasons in response to climatic and soil variations and to the prevalence of plant diseases, insects, and other pests. The American Society of Agronomy (1933) recommends the continuation of a field experiment over a number of years so as to give a random sample of such seasonal effects. As an illustration, seasonal variability at the Hays (Kansas) substation is greater than that due to soil.

    Crop      Variable Factor     Acre Yield (Bu.)
    Wheat     Season              [illegible]
    Wheat     Soil                [illegible]

The standard deviation due to soil and season combined would be $s = \sqrt{s_1^2 + s_2^2}$, where $s_1$ and $s_2$ are the standard deviations due to soil and to season [numerical values illegible]. The reduction in seasonal variation would require a replication of the test over a greater number of years. Ordinarily, a minimum of 3 years should be required in a field experiment where a seasonal influence is important.

VI. Plot Arrangements

Each variety or treatment may be arranged either (1) in the same order in each replicate, or (2) entirely at random in each replicate. The former is called a systematic distribution, while the latter is designated as a random arrangement. Until rather recently, systematic distributions have been generally used in field experiments. Random arrangements have been advocated by Fisher (1931) (1937) and the Rothamsted workers, who claim that randomization is necessary for a valid estimate of error. Regardless of the arrangement used, the various plots of a variety or treatment should be arranged so as to sample the experimental area adequately. This usually leads to certain restrictions on the arrangement.

(a) Random Arrangements

To justify random arrangements, Fisher (1931) states that uniformity trials have quite generally established the fact that soil fertility cannot be regarded as distributed at random but to some extent systematically.
On the average, nearby plots are known to be more alike than those farther apart. Moreover, soil fertility distribution is seldom or never so systematic that it could be represented by a single mathematical formula.

As to the estimate of error, Goulden (1931) explains that it depends upon differences between plots treated alike. Such an estimate will be valid only when pairs of plots treated alike are not nearer together or farther apart than pairs of plots treated differently. The total variance is made up of differences between plots in both directions. When the differences between plots treated differently are reduced by any sort of systematic arrangement, one must automatically increase the differences between plots treated alike, and vice versa, e.g.,

    V (total variance) = A (plots treated alike) + B (plots treated differently)

An alteration in either A or B will result in a corresponding alteration of the other in the opposite direction. Systematic arrangements which attempt to distribute the plots of any one variety or treatment as widely as possible over the experimental area tend to reduce B and increase A. Thus, the real differences between varieties or treatments are reduced and the experimental error increased.

An example of a random arrangement for 8 "varieties" in 4 replicates is as follows:

    Replicate I:    5-7-2-4-8-6-3-1
    Replicate II:   1-3-5-6-2-8-7-4
    Replicate III:  1-C-2-3-p-7-6-4
    Replicate IV:   7-4-2-1-8-3-5-6

In practice, a set of random numbers such as those compiled by Tippett (1927) is useful to effect randomization of treatments or varieties. One may also draw numbered chips at random or shuffle cards to obtain a random arrangement.

(b) Systematic Arrangements

A systematic arrangement is the repetition of the varieties in the same order in each replicate. Correlation between adjacent varieties is likely under such arrangements. However, systematic arrangements may be more practical in some experiments.
Certain advantages have been given by the advocates of systematic arrangements: (1) Simplicity. It facilitates planting, harvesting, and note-taking operations. (2) It provides adequate sampling of the soil, i.e., allows for "intelligent placement" of the various varieties or treatments. (3) Varieties may be arranged in the order of maturity so as to facilitate machine harvest of field plots. (4) It may be desirable to alternate dissimilar varieties (bearded and beardless) so that mechanical mixtures can be detected in subsequent years. Systematic arrangement may be effective in such cases.

Through the use of plots which provide for the elimination of plant competition effects, systematic distribution loses one of its most serious sources of systematic error. The plot scatter on the experimental area is a matter of simple repetition when the plots are all planted in a single series, viz.,

Replicate I         Replicate II        Replicate III
A B C D E F G H     A B C D E F G H     A B C D E F G H

As a rule, all plots cannot be placed exactly in one series, i.e., there are either too few or too many. It is advisable to commence each block with a different variety, especially when there is a soil gradient in the same direction as the series. This eliminates the possibility that one variety will fall on the best soil in each block. For compact blocks, the knight's move (one down and two over) is a common arrangement to secure an adequate scatter, viz.,

Replicate    Varieties
I            A B C D E F G H
II           G H A B C D E F
III          E F G H A B C D

(c) Influence of Arrangement on Error

Few data are available to show the relative accuracy of systematic and random arrangements. The "Student"-Fisher controversy in 1936 indicates that the problem has not been fully settled. In a comparison of diagonal with random arrangements, Tedin (1931) found that the degree of variability within 6 by 5 blocks was not influenced by either arrangement in the estimate of error.
However, he advised random arrangements for the highest degree of scientific accuracy. In studies from uniformity trials with rice, Pan (1935) concluded that, with a systematic arrangement of varieties, the deviations from mathematical expectation were too great to be explained on the basis of random sampling. In a randomized arrangement, the number of differences in yield between all possible comparisons of hypothetical varieties that fell within a range of 0.5σ, 1.0σ, etc., were computed. Satisfactory agreement with mathematical expectation was obtained in two experiments, and poor agreement in one (P = less than 0.01). On the other hand, Odland and Garber (1928) obtained somewhat lower standard deviations from systematic arrangements than from the theoretical random arrangement. So far as small grains in nursery plots are concerned, Love and Craig (1938) found the relative yields to be about the same for systematic and random arrangements.

VII. Error Control

The differences between plots of a single treatment in a replicated experiment are due partly to experimental error and partly to the average differences between replicates. The variability between replicates is irrelevant to the experimental test when each variety or treatment occurs but once in a replicate. Therefore, the variance due to replicates or blocks is generally removed from the error. The precision of the experiment becomes greater when a large amount of the total variability can be removed in this way.

The shapes of plots and blocks are also concerned in error control. Long narrow plots are preferable within the block so long as the blocks themselves approach a square in shape. The basic experimental designs are the randomized block and Latin square arrangements. (See Goulden, 1939.)

VIII. Randomized Blocks

The randomized block test is the simplest type of experiment where satisfactory error control is obtained.
This type of design is extremely flexible and can be used for as many as 30 treatments. The principal restriction in this test is that the same treatment should fall only once in each block, the treatments or varieties being arranged at random. The number of replicates or blocks depends somewhat upon the number of treatments included in a block and the degree of precision desired. It is preferable for the test area to be square in shape, although this is not absolutely necessary.

(a) Field Arrangement

A field arrangement for 10 varieties in 4 blocks could be as follows:*

I      5  10   7   2   4   8   9   6   3   1
II     9   1   8   2   5  10   3   7   6   4
III    6   1   2   9   8   3  10   5   7   4
IV     3   7   5   4   9   6   2   8  10   1

For more than 30 varieties, special designs should be used. (See later chapters.)

(b) Computation of Sums of Squares

The yield data can be arranged conveniently for computation as in the table below. (Data from Goulden, 1939.)

                                  Varieties
Blocks      1      2      3      4      5      6      7      8      9     10    Totals
I         34.0   16.0   34.1   14.5   18.5   29.9   28.6   16.0   17.3   23.1    232.0
II        14.0   11.0   20.5   15.5   13.6   28.2   27.6    8.3   12.1   29.5    180.3
III       26.6    9.0   29.3    7.9   15.4   25.5   21.5    5.6    8.1   22.6    171.5
IV        18.5   11.9   21.0   13.2    8.9   28.8   18.5    9.5   10.3   17.7    158.3
Totals    93.1   47.9  104.9   51.1   56.4  112.4   96.2   39.4   47.8   92.9    742.1
Means    23.28  11.98  26.22  12.78  14.10  28.10  24.05   9.85  11.95  23.22    18.55

First, it is necessary to compute the sums of squares for total, varieties, blocks (or replicates), and error. The correction factor is $(Sx)^2/N$, or $(742.1)^2/40 = 550{,}712.41/40 = 13{,}767.81$.

$$\text{Total} = S(x^2) - \frac{(Sx)^2}{N} = 16{,}279.27 - 13{,}767.81 = 2511.46$$

$$\text{Varieties} = \frac{S(x_v^2)}{n} - \frac{(Sx)^2}{N} = \frac{62{,}114.01}{4} - 13{,}767.81 = 1760.69$$

In this case, it is necessary to square the total for each variety and divide by the number of values that make up each total to reduce the results to a single-plot basis.

*A set of random numbers such as table 6 in the appendix is useful to randomize the varieties. In fact, columns I, III, V, and VII were used.
$$\text{Blocks} = \frac{S(x_b^2)}{m} - \frac{(Sx)^2}{N} = \frac{140{,}803.23}{10} - 13{,}767.81 = 312.51$$

$$\text{Error} = \text{Total} - (\text{varieties} + \text{blocks}) = 2511.46 - (1760.69 + 312.51) = 438.26$$

The data are assembled in a convenient table as follows:

Variation                Sums        Mean
due to         D.F.      Squares     Square      s        F-value
Blocks           3        312.51     104.17                6.42**
Varieties        9       1760.69     195.63               12.05**
Error           27        438.26      16.23     4.029
Total           39       2511.46

**Exceeds the 1.0 per cent point, i.e., the value of "F" which has a probability of 0.01 of occurring due to chance.

$$F = \frac{\text{larger variance}}{\text{smaller variance}} = \frac{195.63}{16.23} = 12.05$$

By reference to the F-table, it is observed that the obtained F-value exceeds the 1.0 per cent point in both cases. The other computations are as follows:

Standard error of a single determination: $s = \sqrt{16.23} = 4.029$
Standard error of the mean for each variety: $\sigma_{\bar{x}} = s/\sqrt{n} = 4.029/\sqrt{4} = 2.0143$
Standard error of a difference: $\sigma_d = \sigma_{\bar{x}}\sqrt{2} = (2.0143)(1.4142) = 2.8486$
Level of significance for 5 pct. point: $\sigma_d \cdot t$ (for 27 d.f.) $= (2.8486)(2.052) = 5.8453$

In this case, 2.052 times the standard error of the difference gives odds of 19:1. This value can be obtained from the "t" table by Fisher (1934), where "t" is taken for the degrees of freedom for error at the 5 per cent point.

(c) Application to Mean Comparisons

Tests are sometimes found in which the value of z or F, for the comparison of variances due to varieties and error, just fails to reach the 5 per cent level of significance. This would indicate that the differences between variety means were of doubtful significance. In spite of this, certain differences between variety means can often be found which exceed twice the standard error. The use of twice the standard error (which gives approximately odds of 19:1 as the degrees of freedom approach 60) would indicate that certain differences might be significant. However, in such cases the testimony of the "z" or "F" test should be accepted as correct.
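The arithmetic of the randomized block analysis in (b) can be retraced from the marginal totals alone. The sketch below is illustrative only: the function name `rb_anova` is a hypothetical helper, and the inputs are assumptions taken from the worked example (variety totals, block totals, the quoted S(x²) of 16,279.27, and the quoted t of 2.052 for 27 d.f.). Small differences in the last decimal arise because the text rounds at each step.

```python
import math

def rb_anova(variety_totals, block_totals, sum_sq_plots, t_value):
    """Randomized block analysis of variance from marginal totals.

    Illustrative helper; argument names are assumptions, not the
    authors' notation.
    """
    n_var, n_blk = len(variety_totals), len(block_totals)
    n_plots = n_var * n_blk
    grand = sum(block_totals)
    cf = grand ** 2 / n_plots                      # correction factor
    ss_total = sum_sq_plots - cf
    ss_var = sum(t * t for t in variety_totals) / n_blk - cf
    ss_blk = sum(t * t for t in block_totals) / n_var - cf
    ss_err = ss_total - ss_var - ss_blk
    df_var, df_blk = n_var - 1, n_blk - 1
    df_err = df_var * df_blk
    ms_err = ss_err / df_err
    se_mean = math.sqrt(ms_err / n_blk)            # s / sqrt(n)
    se_diff = se_mean * math.sqrt(2)
    return {
        "cf": cf, "ss_total": ss_total, "ss_var": ss_var,
        "ss_blk": ss_blk, "ss_err": ss_err, "ms_err": ms_err,
        "f_var": (ss_var / df_var) / ms_err,
        "f_blk": (ss_blk / df_blk) / ms_err,
        "se_mean": se_mean, "se_diff": se_diff,
        "lsd": se_diff * t_value,                  # level of significance
    }

result = rb_anova(
    variety_totals=[93.1, 47.9, 104.9, 51.1, 56.4,
                    112.4, 96.2, 39.4, 47.8, 92.9],
    block_totals=[232.0, 180.3, 171.5, 158.3],
    sum_sq_plots=16279.27,     # S(x^2) as quoted in the text
    t_value=2.052,             # t for 27 d.f. as quoted in the text
)
for name, value in result.items():
    print(f"{name:9s} {value:12.4f}")
```

Only the totals are needed because the variety and block sums of squares depend on the plot yields solely through their margins; the individual plots enter through S(x²).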
Twice the standard error is not a sufficiently stringent test for the comparison of the greatest yield difference found in a large set of possible differences. Student (1927) and Tippett (1937) have both pointed out that, when the highest and lowest values are compared, the conventional use of twice the standard error to obtain odds approximately equivalent to the 5 per cent level of significance is no longer valid. For example, with 10 varieties in the test the difference between the highest and lowest varieties would need to reach 3.2 times the standard error to lie on the 5 per cent level of significance. On the other hand, when "F" is determined significant, the practice of using twice the standard error of the difference of two means as a criterion for significance may be too stringent when the means under consideration are contiguous in an arrangement of the variety (or treatment) means in order of magnitude.

IX. The Latin Square

The Latin square design is very efficient where a small number of varieties or treatments is being tested, but it becomes unwieldy for more than 10. Two restrictions are imposed on the treatments in this design, i.e., the same treatment can occur only once in the same row or column. The treatments are arranged at random within these restrictions. The limitation of the Latin square for a large number of varieties is due to the requirement of the same number of replications as treatments. It should be emphasized that the plots need not be square in shape. (See Fisher and Wishart, 1930.) This design gives error control across the field in two directions, which takes care of soil gradients in either direction. The most generally used Latin squares vary from 4 by 4 to 10 by 10.

Some data from an irrigation study with sugar beets will be used as an illustration of the Latin square arrangement in the field as well as for the statistical analysis.

(a) Field Plot Arrangement

The field lay-out for the 5 irrigation treatments (A, B, C, D, and E) was as follows:

                 Columns
Rows      1    2    3    4    5
1         E    D    A    B    C
2         C    E    D    A    B
3         A    C    B    E    D
4         D    B    E    C    A
5         B    A    C    D    E

(b) Analysis of Data

The data for the irrigation study are compiled below, followed by the statistical analysis.

                            Tons Beets Per Acre
                                 Columns                                    Row
Row            1           2           3           4           5          Totals
1          18.32 (E)   19.40 (D)   20.66 (A)   22.63 (B)   18.65 (C)      99.97
2          20.68 (C)   14.29 (E)   18.32 (D)   20.02 (A)   20.58 (B)      94.39
3          26.04 (A)   17.49 (C)   21.06 (B)   18.91 (E)   20.03 (D)     103.53
4          22.31 (D)   22.93 (B)   17.15 (E)   17.14 (C)   20.62 (A)     100.40
5          24.44 (B)   20.25 (A)   18.92 (C)   19.73 (D)   14.07 (E)      97.41
Column
Totals      112.19       94.41       96.61       98.54       93.95       495.70

Treatment      A         B         C         D         E
Totals       107.59    111.74     92.88    100.55     82.94
Means         21.52     22.35     18.58     20.11     16.59

Correction factor $= (Sx)^2/N = (495.70)^2/25 = 9828.7396$

$$\text{Total} = S(x^2) - \frac{(Sx)^2}{N} = 10{,}007.8598 - 9{,}828.7396 = 179.1202$$

$$\text{Rows} = \frac{S(x_r^2)}{n} - \frac{(Sx)^2}{N} = 9{,}838.1604 - 9{,}828.7396 = 9.4208$$

$$\text{Columns} = \frac{S(x_c^2)}{n} - \frac{(Sx)^2}{N} = 9{,}873.9164 - 9{,}828.7396 = 45.1768$$

$$\text{Treatments} = \frac{S(x_t^2)}{n} - \frac{(Sx)^2}{N} = 9{,}935.4952 - 9{,}828.7396 = 106.7556$$

$$\text{Error} = \text{Total} - (\text{rows} + \text{columns} + \text{treatments}) = 179.1202 - (9.4208 + 45.1768 + 106.7556) = 17.7670$$

The data are assembled to complete the analysis:

Variation                Sums         Mean       Standard        F-Value
due to         D.F.      Squares      Square     Error       Actual   5% Point
Rows             4        9.4208      2.3552                  1.59      3.26
Columns          4       45.1768     11.2942                  7.63      3.26
Treatments       4      106.7556     26.6889                 18.03      3.26
Error           12       17.7670      1.4806     1.2168
Total           24      179.1202

$$F = \frac{\text{larger variance}}{\text{smaller variance}} = \frac{26.6889}{1.4806} = 18.03$$

Since the computed F-value is greater than that for the 5 per cent point, significant differences exist between treatments. The other constants may be computed as follows:

Standard error of the mean: $\sigma_{\bar{x}} = s/\sqrt{n} = 1.2168/\sqrt{5} = 0.544$ tons.
Standard error of the difference: $\sigma_d = \sigma_{\bar{x}}\sqrt{2} = 0.544\sqrt{2} = 0.77$ tons.
Level of significance (for 5 pct. point) $= 2.179\,\sigma_d = (2.179)(0.77) = 1.68$ tons.

The data may be arranged as follows in summary form:

Treatment                                  Mean Yield (tons)
A                                               21.52
B                                               22.35
C                                               18.58
D                                               20.11
E                                               16.59
Standard Error of the Mean                       0.544
Level of Significance (5% point)                 1.68

B -- Relation of Type of Experiment to Design

X. Variety and Similar Tests

The variety test is probably the most common type of agronomic field experiment. Crop varieties are tested for yield in most crop improvement programs to determine which ones are superior under given soil and climatic conditions.

Varieties are known to differ as to the best rate of planting. Less seed is required under dry land than under irrigated conditions due to the moisture factor. Carleton (1909) points out that winter wheats tiller more than spring wheats and, when winter-hardy, may be sown at a thinner rate. It is not always possible to overcome the objection of differential response of varieties to different rates of seeding in a variety test. It is usually safer to use a somewhat higher rate than that recommended to farmers because variations due to unexpected causes will then have less effect.

Rate and date tests are sometimes combined with variety trials, or they may be conducted separately. The combined test permits a study of differential variety response to different rates or dates. A rather wide range of rates on either side of the optimum is suggested for rate of planting tests in order to determine the point of maximum yield. A test to determine the most satisfactory dates for planting crops is usually an exploratory stage in field experimentation to secure this information for certain environmental conditions. Such tests are usually planted at a regular interval between dates from extremely early in the planting season to extremely late. Due to occasional differential varietal response to time of planting, consideration should be given to the question of planting a variety series at several different dates.
Experiments of this type may be designed as Latin squares for small precise tests with up to 10 varieties, while randomized blocks are commonly used for 10 to 30 varieties. For greater numbers in a single experiment, incomplete block designs should be investigated.

XI. Crop Rotation Experiments

In crop rotations, or other experiments in which a study of residual effects is made, it is necessary to grow all crops used in the rotation each year in order to obtain reliable results. Carleton (1909) early called attention to the fact that this simple but essential matter had been entirely overlooked in many of the older experiments. For accuracy in a rotation series, every stage or crop must experience every condition. Each year there must be as many plots as there are crops or stages in the rotation. For example, in a 4-year rotation of (1) oats seeded to red clover, (2) red clover, (3) corn, and (4) barley, there must be four plots. The plots must be at least in duplicate in order to allow for the removal of soil variability. Adequate replication is the greatest need in crop rotation experiments. In another block in the same test there may be a plot of each crop in continuous culture, although this is not always necessary. The crop rotation test must be conducted over a period of years so that the crop yields will be definitely influenced by the different rotation treatments. Such a test might be laid out as follows:

[Field lay-out: Replicate I and Replicate II, each containing the four plots of the 4-year rotation (red clover, corn, barley, oats) together with the continuous culture plots, the plots being placed in a different order in each replicate.]

To compare this 4-year rotation with a 3-year rotation, it would be necessary to wait 12 years. For a 7 and 5-year rotation, the results could be compared at the end of 35 years, etc.

XII. Cultural Experiments

Cultural experiments include such tests as fall vs. spring plowing, methods of seedbed preparation, surface vs. furrow planting, etc. Field plots are generally necessary for experiments of this type because of the use of farm machinery. Many dryland experiments are concerned with cultural methods. The same procedures for variety tests are generally satisfactory in tests of this kind.

XIII. Fertilizer Experiments

The most reliable information on the fertilizer needs of soils may be obtained from the field experiment. Nutrient solutions and sand cultures are used in special studies. The early fertility experiments at Rothamsted were concerned primarily with the fertilizer value of certain mineral fertilizers as shown by increased crop yields. The present long-time fertilizer experiments are concerned more with comparisons of similar fertilizers, effects on crop plants, and efficiency of fertilizer practices. The earlier workers often tested one fertilizer at a time, but many present workers are inclined to favor more comprehensive tests, i.e., the inclusion of several fertilizers at more than one level. Most investigators use crop yield as the major criterion of fertilizer response.

(a) General Types of Fertilizer Tests

Fertilizer tests may be conducted for several definite purposes.

(1) Deficiency of Fertilizer Elements in a Field: Results of such tests are applicable only to the field tested or, at most, to soil of similar type with similar previous cultural treatment. They are strictly applicable only for the test year, since the crop grown may modify conditions for the next season.

(2) Efficiency of Single Fertilizer Elements: For this type of test it is desirable to have the fertilizer elements tested in minimum. (See Giles, 1914.) Several rates of a standard fertilizer can be compared with one or more rates of a fertilizer that carries the elements in a different form. Equal rates of each fertilizer can be compared also.
(3) Comparative Methods of Application: This type includes tests on depth of placement, time of application, placement to the side vs. with the seed, etc.

(4) Optimum Fertilizer Balance: This type is concerned with fertilizer balance for various crops. It involves many complications when made in the field because it is difficult to control or even measure the fertilizer balance in a field. In such a study it is necessary to estimate by chemical tests the amount of the fertilizer elements furnished by the soil as well as the amount applied. Probably the most practical method to make such a study is to vary each element separately over a wide range of several rates of application. The regression of yield on amount of the element available (amount in plant plus amount applied) may then be calculated. A further complication would be to test all possible combinations of several fertilizers at different levels. The triangle system suggested by Schreiner and Skinner (1918) may be useful for the computation of all possible combinations of three fertilizer elements (say P2O5, NH3, and K2O) at several levels. This triangle system should not be used as a basis for the field lay-out as originally advocated. Such a test should be designed as a factorial experiment. (See Chapter 19.)

(5) Long-Time Effect of Fertilizers: Such tests with various forms of fertilizers are concerned with the physical and chemical properties of soils as well as soil productivity.

(b) Design of Fertilizer Experiments

Several basic principles should be considered in soil fertility experiments. A soil profile to a depth of 3 feet is highly desirable for each series of plots. Before soil treatment experiments are begun, the American Society of Agronomy (1933) recommends that "representative samples of the soil and subsoil should be carefully taken for such analyses as may be desired for future reference".
In the matter of plot design the Society cautions that "the lateral translocation of soil or fertilizer beyond the plot interspaces of soil experiments should be avoided. Since the manner of fertilizer application may affect yields materially, due consideration should be given to this problem".

For fertilizer tests where 2 or more fertilizers are applied at 2 or more levels, the factorial design is suitable. The factorial experiment, explained by Fisher (1934), Yates (1935), Summerby (1937), and others, involves all combinations of the fertilizers and levels (or amounts) of application. The study of interactions is an important consideration in such an experiment. For example, suppose a fertilizer test is to be conducted with nitrogen, phosphorus, and potassium at two different rates each. The rates can be designated by subscripts so as to give the 8 possible treatment variants as follows: N0P0K0, N1P0K0, N0P1K0, N0P0K1, N1P1K0, N1P0K1, N0P1K1, and N1P1K1. Such an experiment can be planned for a randomized block test or for some form of the incomplete block test. Goulden (1939) gives some suggestions on the design of more complicated fertilizer experiments. Residual effects due to past fertility treatments are discussed by Forester (1937).

XIV. Pasture Experiments

In experimental pasture work, the investigator may desire to: (1) determine the amount of herbage produced on an area by different pasture-grass mixtures, (2) find out the influence of fertilizers on pastures as to yield and survival of the palatable species, or (3) measure the influence of different grazing methods on yield and survival. Replication of treatments is vital in any case.

One of the important technique problems is the comparative results from grazing and mechanical harvest of herbage. Stapledon (1931) states that the animal is the master factor in pasture studies.
He tethered sheep on small plots and moved them twice a day in the Aberystwyth pasture researches. Certain advantages are claimed for his tethering method: (1) replicated plots are possible; (2) the experimental sheep are handled and examined twice per day; (3) grazing will be uniform; and (4) the animal capacity is increased per unit area. Schuster (1929) recommends at least 4 replications and 3 animals per plot in pasture investigations. The use of grazing methods permits the effects of trampling on the vegetation to be measured.

Pasture plots may be harvested mechanically, i.e., clipped with a mower or with shears. Brown (1937) advocates the use of grass shears for small cages, the lawn mower for grass less than 6 inches high, and a mowing machine for taller herbage.

Several studies have compared grazing and mechanical harvest of pasture plots. Brown (1937) found that the herbage of grazed and mowed plots varied markedly in time due to animal preferences. Animals void a large proportion of the fertilizer elements consumed in feeds, particularly nitrogen and phosphorus. Thus, mowed pastures may be low in fertilizer elements when compared with grazed pastures. A high correlation between mowed and grazed yields was found when the mowed areas were changed to the previously grazed areas every two or three years. Continuously clipped cages have yielded less than annually mowed cages. Robinson, et al. (1937) found a progressive decrease in the yields of clipped permanent quadrats in relation to grazed areas.

Sampling methods are often involved in the design of pasture experiments. See Chapter 16 for further details on this phase, as well as the report of Vinall and others (1934).

C -- Incomplete Experimental Records

XV. Missing Values in Experiments

In general, replicated field experiments are so arranged that the mean yield for all plots that receive a given treatment provides the best estimate of the effects of that treatment.
Sometimes the yields of some plots are lost or prove unreliable, with the result that the orthogonality of the original design disappears. Since the treatment, block, etc., effects are computed from the total yield of all plots in a given treatment, block, etc., it is necessary to interpolate the yield of the missing plot in order to use the ordinary analysis of variance.

Allan and Wishart (1930) were the first to provide formulae for the estimation of the yield for a single missing plot in either randomized block or Latin square tests. They arrived at their formula by the procedure of fitting constants by least squares. Yates (1933) used a simpler solution by minimizing the error variance obtained when unknowns are substituted for the missing yields. The two formulae give the same results, but the one by Yates also provided a method appropriate for the estimation of the yields of several missing values. His formula is used here.

XVI. Calculation of Single Missing Value

A single missing value can be calculated for either a randomized block or Latin square test.

(a) Randomized Block Test

Some data are given on the effect of date of planting on the yields of sugar beets in which a plot value is missing. The yields are in tons per acre.

Date                        Block Number
Planted        1       2       3       4        5       Total
Early        22.3    21.8    19.7    21.2     20.0      105.0
Medium       18.3    18.4    18.5    21.5     17.3       94.0
Late         17.2    17.2    17.9   (18.8)    16.7      (87.8)   69.0
Very Late    14.9    12.6    13.1    14.4     12.4       67.4
Totals       72.7    70.0    69.2   (75.9)    66.4     (354.2)
                                     57.1               335.4

It is assumed that the yield of the late-planted plot in block 4 is missing. The sums for block 4 and for late planting are given below or to the right of the appropriate block or treatment to show that they are the sums of only the known plots. The values in brackets are filled in later.
The formula for the estimation of yield of this value in a randomized block test is as follows:

$$x = \frac{mM + m'M' - T_x}{(m - 1)(m' - 1)} \qquad (1)$$

where x = yield of missing plot,
m = number of treatments,
m' = number of blocks,
M = sum of known yields of treatment with missing plot,
M' = sum of known yields of block with missing plot,
T_x = total yield of known plots.

(Footnote: This portion is taken entirely from an outline prepared by Dr. F. R. Immer, U. of Minnesota.)

In the sugar beet test used as an example,

$$x = \frac{4(69.0) + 5(57.1) - 335.4}{(4 - 1)(5 - 1)} = 18.8$$

The yield, x = 18.8, is inserted in the table, after which the block, treatment, and general sums are corrected accordingly. These figures are in the brackets.

The analysis of variance will be computed in the usual way, except that the degrees of freedom for error and total have been reduced by one. The degrees of freedom must be reduced by one for each plot value interpolated. The analysis of variance is as follows:

Variation      Degrees     Sums        Mean       Standard          F-Value
due to         Freedom     Squares     Square     Error (s)    Observed   5 pct. point
Blocks            4         13.043      3.2608                    3.71        3.36
Treatments        3        149.618     49.8727                   56.69        3.59
Error            11          9.677      0.8797     0.9379
Total            18        172.338

(b) Latin Square Test

The formula to be used for the interpolation of a single value in a Latin square test is as follows:

$$x = \frac{m(M_r + M_c + M_t) - 2T_x}{(m - 1)(m - 2)} \qquad (2)$$

where x = missing plot yield; M_r, M_c, M_t = totals of known yields of the row, column, and treatment from which the plot is missing; m = number of treatments (also equals number of rows or columns); T_x = total yield of all known plots.

XVII. More than One Missing Value

A method of approximation may be used for more than one missing plot yield. Three plots are missing in the randomized block trial given below:

Date                        Block Number                         Total      Known
Planted        1       2        3        4        5                         Plots
Early        22.3    21.8    (21.2)    21.2     20.0    (106.5)     85.3
Medium       18.3   (18.6)    18.5     21.5     17.3     (94.2)     75.6
Late         17.2    17.2     17.9    (18.7)    16.7     (87.7)     69.0
Very Late    14.9    12.6     13.1     14.4     12.4      67.4
Totals       72.7   (70.2)   (70.7)   (75.8)    66.4    (355.8)
Known Plots          51.6     49.5     57.1              297.3

The plot yields given in brackets have been assumed to be missing. As it is possible to interpolate the yield of only one plot at a time, one must assume yields for all missing plots except the one to be interpolated. First, suppose the medium planting in block 2 is interpolated. For the early plot in block 3 and the late plot in block 4, one must insert the mean yield of the known plots for those two dates, or 21.3 and 17.2 tons, respectively.

The formula for interpolation, in which have been substituted the values for the first approximation for the yield of the medium planting in block 2, is as follows:

$$x = \frac{mM + m'M' - T_x}{(m - 1)(m' - 1)} = \frac{4(75.6) + 5(51.6) - 335.8}{(4 - 1)(5 - 1)} = 18.7$$

The same procedure can be followed for the early planting in block 3, except that the guess of 21.3 used before should be removed. The interpolated value (18.7) is used for the medium planting in block 2, and the guessed value (17.2) for the late planting in block 4. The grand total is corrected accordingly. The value of x in this case is 21.3. In like manner the yield of the late planting in block 4 is interpolated. This is found to be 18.7.

Since it was necessary to estimate the yields of two plots in order to start the interpolation process, the values obtained will be somewhat in error. Therefore, the values are re-interpolated, using the values obtained by the first interpolation for all but the plot yield being calculated. This is repeated until no further changes take place. The values obtained in this case were as follows:

                             Approximations
Treatment     Block       1st      2nd      3rd
Medium          2        18.7     18.6     18.6
Early           3        21.3     21.2     21.2
Late            4        18.7     18.7     18.7

The interpolated values did not change after the second approximation.
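The successive approximations above lend themselves to a short routine. The sketch below is offered only as an illustration of formula (1) applied repeatedly: the names `missing_plot` and `fill_missing` are hypothetical, the starting guesses are the unrounded means of the known plots of each treatment, and intermediate values are not rounded to one decimal as they are in the hand computation, so the early approximations may differ slightly from those tabulated while the final values agree.

```python
def missing_plot(m, m_blk, M, M_blk, T):
    """Formula (1): estimate of one missing plot in a randomized block test."""
    return (m * M + m_blk * M_blk - T) / ((m - 1) * (m_blk - 1))

def fill_missing(yields, tol=1e-9, max_rounds=100):
    """Re-interpolate every missing plot (None) until no value changes."""
    names = list(yields)
    m = len(names)                         # number of treatments
    m_blk = len(yields[names[0]])          # number of blocks
    table = {k: list(v) for k, v in yields.items()}
    missing = [(k, j) for k in names for j in range(m_blk)
               if table[k][j] is None]
    for k, j in missing:                   # start from treatment means
        known = [v for v in yields[k] if v is not None]
        table[k][j] = sum(known) / len(known)
    for _ in range(max_rounds):
        change = 0.0
        for k, j in missing:
            old = table[k][j]
            M = sum(table[k]) - old                          # treatment sum
            M_blk = sum(table[r][j] for r in names) - old    # block sum
            T = sum(sum(row) for row in table.values()) - old
            table[k][j] = missing_plot(m, m_blk, M, M_blk, T)
            change = max(change, abs(table[k][j] - old))
        if change < tol:
            break
    return table

# Single missing plot of Section XVI (late planting, block 4).
print(round(missing_plot(4, 5, 69.0, 57.1, 335.4), 1))

# Three missing plots of Section XVII.
beets = {
    "Early":     [22.3, 21.8, None, 21.2, 20.0],
    "Medium":    [18.3, None, 18.5, 21.5, 17.3],
    "Late":      [17.2, 17.2, 17.9, None, 16.7],
    "Very Late": [14.9, 12.6, 13.1, 14.4, 12.4],
}
filled = fill_missing(beets)
print(round(filled["Medium"][1], 1),
      round(filled["Early"][2], 1),
      round(filled["Late"][3], 1))
```

The order in which the missing plots are visited affects only the intermediate approximations, not the values at which the process settles.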
The interpolated yields are inserted in the above table (as shown in brackets), after which the correct treatment and block totals are determined. The analysis of variance is then computed as shown below:

Variation      Degrees     Sums        Mean       Standard          F-Value
due to         Freedom     Squares     Square     Error (s)    Obtained   5% Point
Blocks            4         11.923      2.9808                    3.23        3.63
Treatments        3        160.306     53.4353                   57.88        3.86
Error             9          8.309      0.9232     0.9608
Totals           16        180.538

Three degrees of freedom have been subtracted from error and from total because three plot yields were interpolated.

XVIII. Tests of Significance

The error calculated from analyses of variance, in which one or more plot values have been interpolated, is a valid estimate of experimental error when the degrees of freedom have been reduced by one for each value interpolated. However, the variance due to treatments is not entirely without bias, being always higher than it should be. The significance of the test is accentuated, but the correction for this condition is quite trivial for cases in which only a single value is missing. The bias is more pronounced where many plots are missing.

Tests of significance by means of the analysis of variance are generally all that are required. For a single missing plot, the treatment mean with the estimated value of the missing plot will have a variance as follows for a randomized block test:

$$s_{\bar{x}}^2 = \frac{s^2}{m'}\left[1 + \frac{m}{(m - 1)(m' - 1)}\right] \qquad (3)$$

where m' = number of blocks, m = number of treatments, and $s^2$ = variance of a single plot calculated from error. The variance of the treatment mean would be $s^2/m'$ where no plot was missing.

For a single missing plot in a Latin square, the variance of the treatment mean with the missing value would be as follows:

$$s_{\bar{x}}^2 = \frac{s^2}{m}\left[1 + \frac{m}{(m - 1)(m - 2)}\right] \qquad (4)$$

For more than one missing plot, these formulae are strictly applicable only to comparisons between means where one contains no missing plot.
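As a numerical illustration of formulae (3) and (4), the sketch below assumes their usual form for a single missing plot, namely the ordinary variance of a mean multiplied by a factor greater than one. The function names are hypothetical. The randomized block figures (error mean square 0.8797, 4 treatments, 5 blocks) come from the single-missing-plot example of Section XVI; the Latin square call uses an error mean square of 1.4806 for a 5 by 5 square purely as an assumed input, since no plot was missing in that example.

```python
def var_mean_missing_rb(s2, m, m_blk):
    """Variance of the treatment mean holding the interpolated plot,
    randomized block test (assumed form of formula 3)."""
    return (s2 / m_blk) * (1 + m / ((m - 1) * (m_blk - 1)))

def var_mean_missing_latin(s2, m):
    """Same quantity for a Latin square (assumed form of formula 4)."""
    return (s2 / m) * (1 + m / ((m - 1) * (m - 2)))

s2 = 0.8797                    # error mean square, Section XVI example
complete = s2 / 5              # variance of a mean with no plot missing
with_gap = var_mean_missing_rb(s2, m=4, m_blk=5)
print(f"no plot missing:  {complete:.4f}")
print(f"one plot missing: {with_gap:.4f}")
print(f"ratio:            {with_gap / complete:.4f}")

# Hypothetical Latin square case for comparison.
print(f"Latin square:     {var_mean_missing_latin(1.4806, m=5):.4f}")
```

In the randomized block case the mean that carries the interpolated plot has a variance one third larger than the other treatment means.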
To find the variance of the difference between two means, both of which contain missing values, is rather difficult. Ref eren ces 1. Allen, E. W. Initiating and Executing Agronomic Research. Jour. Am. Soc. Agron., 22:3^1-3^8. 1930. 2. Allen, F. E., and Wishart, J. A Method of Estimating the Yield of a Missing Plot in Field Experiments. Jour. Agr. Sci., 20:399-^06. 1930. 3. Brown, B. A. Technic.in Pasture Research. Jour. Am. Soc. Agron., 29:U68-U76. 1957. k. Carleton, M. A. Limitation in Field Experiments . Soc. Prom. Agr. Sci., pp. 55-61. 1909. 5. Fisher, R. A. The Technique of Field Experiments. Fothamsted Conf., 15:11-13- 1931. 6. Statistical Methods for Research Workers. Oliver and Boyd. 5th Ed. pp. 199-251. 193V. ?■• Design of Experiments, Oliver and Boyd. 2nd Ed. vo . 15-100. 1937. 8. , The Half -Drill Strip System of Agricultural Experiments. Nature, 158:1101. 1956. 9. _ j and Wishart, J. The Arrangement of Field Experiments and the Statistical Reduction of the Results. Imp. Bur. Soil Sci., Tech. Comm. No. 10. I.95O. 10. Forester, E. C. Design of Agronomic Experiments for Plots Differentiated in Fertility by Past Treatments. la. Agr. Exp. Sta. Res. Bui. 22o. 1937. 11. G-iles, P. L. On the Plans of Fertilizer Experiments. Jour. Am. Soc. Agron., 12. Goulden, C. H. Modern Methods of Field Experimentation. Sci. Agr., 11:681-701. 1931. 13- , Statistical Methods in Agronomic Research. Can. Seed Growers Assn. 1929. 14. Methods of Statistical Analysis* John Wiley, pp. ^5-51? and 11+2-11+8. 1959. 15. Hayes, E. K. Controlling Experimental Error in Nursery Trials. Jour. Am. Soc. Agron., 15:177-192-. I923. 16. Love, H . II. and Craig, W. T. Investigations in Plot Technic with Small Grains. Cornell Memoir 214. 1958. 17. Odland, T. E., and Garber, R.J. Size of Plat and Number of Replications in Field Experiments with Soybeans. Jour, Am. Soc. Agron., 20:95-108* 1928. 18. Pan, C. Uniformity Trials with Rice. Jour. Am. Soc. Agron., 27:279-285. 1933. 185 19. Paterson, D. D. 
Statistical Technique in Agricultural Research. McGraw-Hill. pp. 156-188. 1939. 20. Robinson, R. R., Pierre, W. H., and Acker-man, R. A. A Comparison of Grazing and Clipping for Determining the Response of Permanent Pastures to Fertilization. Jour. Am. Soc. Agron. 29:3^9-359. 1937 . 21. Schreiner, 0., and Skinner, J. J. The Triangle System of Fertilizer Experiments. Jour. Am. Soc. Agron., 10: 225-2^6. 1918. 22. Schuster, G. L. Methods of Research in Pasture Investigations. Jour. Am. Soc. Agron., 21:666-673. 1929. 23. Standards for the Conduct and Interpretation of Field and Lysimeter Experiments. Jour. Am. Soc. Agron., 25:803-828. 1933 . 2k. Stapledon, R. G. The Technique of Grassland Experiments. Rothamsted Conf . 13, pp. 22-28. 1931. 25. Student. Errors of Routine Analysis. Biometrika, 5:351' 19 2 7. 26. The Half -Drill Strip System of Agricultural Experiments. Nature, I38: 971-972. 1936. 27. Summerly, R. The Use of the Analysis of Variance in Soil and Fertilizer Experi- ments with a Particular Reference to Interactions. Sci. Agr., 17:302-311. 1937. 28. Tedin, 0. Influence of Systematic Arrangement upon the Estimate of Error in Field Experiments. Jour. Agr. Sci., 21:191-208. 1931. 29. Tippett, L. H. C. Tracts for Computers XV --Random Sampling Numbers. Cambridge U. Press. 1927. 30. Tippett, L. H. C. The Methods of Statistics.' Williams and Norgate. 2nd Ed. pp. 125-139. 1937. 31. Vinall, H. N., et al. Report of the Joint Committee on Pasture Research. Am. Soc. Agron. (mimeographed) 193^ • 32. Wishart, J., and Sanders, H. G. Principles and Practice of Field Experimentation. Emp. Cotton Growing Corp., pp. 60-85. 1935. 33. Yates, F. The Analysis of Replicated Experiments when the Field Results are Incomplete. Emp. Jour. Exp. Agr., 2: 129-1^2. 1933- 3k. Complex Experiments. Suppi. Jour. Roy. Stat. Soc, 2:181-2^7. 1935. Questions for Discussion 1. What criticisms have been made of agronomic experiments in general? Are they justified? 2. 
What justification is there for the antagonism sometimes found between scientific theory and practical facts?
3. What are the principal objectives in agricultural experiments?
4. What factors should be considered in the outline of an experiment?
5. What is the principle of the extremes? Illustrate.
6. In laying out field experiments in which one variable is continuous, what principle or rule should be followed with respect to the extremes?
7. Distinguish between preliminary and permanent experiments.
8. What is a simple experiment? Its limitations? Advantages?
9. What are combination or complex experiments? Are they desirable? Why?
10. What sources of variation or error, other than that due to soil or season, may occur in field experiments?
11. How does a random arrangement differ from a systematic arrangement?
12. Is soil heterogeneity systematic or random? Explain.
13. Upon what is the estimate of error based? How influenced by a systematic plot arrangement?
14. What are the advantages usually given for systematic arrangement? Random arrangement?
15. What is meant by the "knight's move"?
16. Discuss the relative efficiency of systematic and random plot arrangements.
17. What is a randomized block test? What restrictions are imposed? Its limitations?
18. What is the Latin square arrangement of plots? What is the primary objective in this arrangement?
19. What conditions should be observed in planning variety tests? What is a check variety?
20. What precautions are necessary in crop rotation tests?
21. What are the limitations in fertilizer tests?
22. In rotation and soil treatment tests what should be the treatment of the check?
23. What is the law of the minimum? Its application to fertilizer tests?
24. What is a factorial experiment? Give an example.
25. Discuss grazing vs. mechanical harvest of herbage.
26. Why is it necessary to calculate missing values for the analysis of variance to apply?
27.
How are the degrees of freedom modified when a missing; value is computed? Problem s 1. Different amounts of fertilizer were-: applied to sugar beets by the Colorado Experi- ment Station in 193° (Data from D. W. Robertson) in a randomized block trial. The yields in pounds of sugar per plot for various amounts of treble superphosphate applied per acre were as follows: Phosphate Treatment None 100 lbs. 200 lbs. 300 lbs. Totals Block I II 343 I85. 358 413 393 .435 427 ' 468 III 208 483 463 487 Total 730 1256 1291 1382 1321 150" 10 4l £<*; 466 (a) Compute the analysis of variance for a randomized block experiment (b) Determine significance by use of the "F" test. (c) Compare the average .yields the no treatment and 200 lb. treatment by means of the standard error. A rate of planting test with sugar beets was conducted in 1931 by H. E. Brewbaker, The rates used were: 15, 20, 23, and 30 lbs. per acre. The experiment was de- signed as a 4 by 4 Latin square, the data for which follow: Tons beets per (3) 16.73 (4) 17.74 (1) 17.52 (2) 18.21 ■ ore ~liTT (2) 17.2c 38 (3) 00 'I o ,13 •53 Column Totals 70.20 70.26 Row totals ( ] 10.35 (2) 15.27 63.73 (3) 18.83 (1) 16.94 70 . 71 (2) 17.97 (*0 18.31 71.95 (1) 17-7 1 !- (3) 16.61 72.09 70.89 67.13 278.48 (a) Compute the analysis of variance. (b) Obtain "F" for a comparison of error with rows, columns, and treatments. (c) Test the significance of 'F' : . (d) Continue the analysis and compute the standard error of the mean, standard error of a difference, and level of significance in case it Is justified. 185 3- Design a crop rotation experiment to show the effects of a legume in a rotation. The rotations are as follows: (a) fcarley (seeded to alfalfa), alfalfa, alfalfa, corn, and sugar heets; and (t>) "barley, corn, and sugar beets. k. Four varieties of wheat were grown in I93O in a randomized block trial in 5 blocks The yield of one plot was lost, (a) Calculate the. yield of the missing plot, 'b) Complete the analysis of variance. 
Block Variety Kanred 5k.h in .7 52.1 56.1 61.0 Cheyenne ko.7 1^6.5 59-9 53-7 Tenmarq. 61.7 51.7 ^3-5 61.9 58.7 Hays No. 2 55.5 50.6 61.9 1*5.1 72. k

5. The same 4 varieties were grown in a randomized block test in 1937. The records on 2 plots were lost. Calculate the missing values and complete the analysis of variance.

Block Variety 5 Kanred 5^.6 53.7 68.0 55.2 58.5 62.1 Cheyenne 66.3 60.9 6^.8 67.6 __-._ 66.2 Tenmarq. 58.5 57-5 kk.i 65.6 52.9 51.6 Hays No. 2 57.3 60.5 62.2 58.8 5^-3

CHAPTER XVI

QUADRAT AND OTHER SAMPLING METHODS

I. Sampling in Agronomic Work

There are times when it is impractical to use the whole plot or plant population to obtain a numerical determination of some characteristic of the experimental material. In such cases as tiller number, yield, percentage dry matter, nitrogen or sugar in the crop, it is more practical to sample only a proportion of the whole. To quote Wishart and Sanders (1935): "The object is to obtain as close an estimate as we can of the measure which would be obtained accurately, within the limits of experimental error, had the produce of the whole plot been counted, weighed, or analyzed." The sample must be representative and taken in such a manner as to assure that end. It is also necessary to take into account the further source of error due to the sampling process. Yields determined by sampling procedure are not determined as accurately as when the entire plot is taken, but it is often advantageous to sacrifice some accuracy to save labor.

II. Theory of Sampling

The sampling distributions so far considered have been based on the assumption of independence. The simple theory of errors does not apply when the variation is heterogeneous and the extent to which the sources of variation are represented is not left to chance.
It has been shown in a randomized block trial that the variance due to error is an unbiased estimate of the error variance of the infinite population from which the data under consideration are a sample. The other items in the mean square column (blocks and varieties) are not unbiased estimates of the respective variances of the population. In fact, they still contain the variance due to error as the degrees of freedom become indefinitely large. For example, the estimated variance due to varieties, in the theory of large samples, is made up of the true variance due to varieties plus the variance due to error. This becomes important in statistics of estimation as shown by Tippett (1937), Immer (1932, 1936), and others.

Suppose some data on protein in relation to different rate-of-planting treatments in corn for 1931 be used to illustrate the computations:

Method Kate Prote: .n Per cent per Sample v/ Planted Planted B3 .ock I Block 11 Block III Variety (i) (2) (1) (2) (1) (2) Totals Golden Glow Hills o 10.357' 10.4o8 10A25 10. 522 10 . 1 00 10.043 oj. . op_; :l it a 9-525 9.422 9.228 9.3^2 9.667 9.543 11 ii •s 8.995 3.903 9'. 325 9.211 v < . -vd. . 9.479 35^15 Pride North 3 IO.363 10.351 9.713 9-553 9.627 59.212 n ii 4 0.171 9-2^5 9.399 9.576 9.057 9.052 <".<=; -\r\ e :l |1 5 9 . loo 9 . 120 9.171 9.211 8.527 8.504 53.690 Golden Glow Drills 12^ IO.072 ic. 038 10.528 io.4o8 IO.38O 10.438 61.714 ■M it 9 9.750 9 •"'-/? 9.696 9.559 9-^33 9.4.51 57.474- ii ii o 8.8^6 3.778 a. old 3.oo4 9.143 9.080 93.512 ii H 3 8.482 8 . 590 9Jl79 9.4-22 go 1 ^ 9.002 55.02p Pride Worth 1 O JL.w 9.872 9.929 10.009 9.384 9.827 0.724 l 59.7^i '■ H H 9 9.325 9A6B 8.853 3.892 9.H4 0. 14° o4.8oo ■I ,i 6 O '-vO-"i 0.
;Oy 3.761 9.523 9.365 8.832 3.802 34.384 .1 ;l 3 8.64.1 8.767 9.260 9,428 3.455 8.510 33.067
Totals 131.323 131.459 133.236 133.237 131.123 131.054 791.432

1/ Protein = N (nitrogen) x 3«7
2/ Plants per hill
3/ Inches between plants in row

The analysis of variance is set forth for the experiment in which two protein determinations were made on the shelled corn per plot. For simplicity, treatments will be considered without regard to variety or method of planting. The calculations for the sums of squares are as follows, the two samples per plot being added together for the plot determinations:

S(x) = 791.432
(Sx)^2/N = 7,456.7216
Total samples:  S(x_s)^2 - (Sx)^2/N    = 7,480.8643 - 7,456.7216      = 24.1427
Plots:          S(x_p)^2/2 - (Sx)^2/N  = 14,961.4426/2 - 7,456.7216   = 23.9997
Blocks:         S(x_b)^2/28 - (Sx)^2/N = 208,799.0186/28 - 7,456.7216 = 0.3862
Treatments:     S(x_t)^2/6 - (Sx)^2/N  = 44,852.7957/6 - 7,456.7216   = 18.7444

The summary for the analysis of variance is as follows:

Variation due to        Degrees of   Sums of    Mean      Standard   F-Value
                        Freedom      Squares    Square    Error
Blocks                      2         0.3862    0.1931                1.03
Treatments                 13        18.7444    1.4419                7.70**
Error                      26         4.8691    0.1873    0.4328
Total for Plots            41        23.9997
Samples within Plots       42         0.1430    0.0034    0.0583
Total samples              83        24.1427

In the simple case where one sample is drawn from each plot with the treatment replicated for m plots, the variance of a treatment mean is $V_p^2/m$, where $V_p^2$, the mean variance between plots, approaches $\sigma_p^2$, the true variance of an individual plot, as m approaches infinity. However, when n samples are drawn from each plot, the variance of a treatment mean is $V_p^2/mn$, where $V_p^2/n$ estimates $\sigma_p^2$, the true variance of an individual plot, plus $\sigma_s^2/n$, the sampling variance of an individual plot mean. This follows because a plot mean is now subject to variation due to more than one sample. It is evident that $\sigma_s^2$ is the true variance of an individual sample taken from a plot.
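The arithmetic behind the summary table above can be checked with a short Python sketch. The raw sums entered below are readings of the printed figures (a grand total of 791.432 over 84 determinations, and the uncorrected sums of squares quoted in the calculations), so they should be treated as assumptions rather than verified data; the variable names are illustrative only.

```python
# Quantities as read from the worked example (assumed readings).
grand_total = 791.432                       # S(x) over all determinations
n_samples = 84                              # 14 treatments x 3 blocks x 2 samples
correction = grand_total ** 2 / n_samples   # (Sx)^2 / N

# Sums of squares; plot, block and treatment totals are squared and
# divided by the number of determinations each total contains.
ss_total_samples = 7480.8643 - correction
ss_plots = 14961.4426 / 2 - correction
ss_blocks = 208799.0186 / 28 - correction
ss_treatments = 44852.7957 / 6 - correction
ss_error = ss_plots - ss_blocks - ss_treatments
ss_within = ss_total_samples - ss_plots

df_blocks, df_treatments, df_error, df_within = 2, 13, 26, 42

ms_blocks = ss_blocks / df_blocks
ms_treatments = ss_treatments / df_treatments
ms_error = ss_error / df_error              # V_p^2 in the text
ms_within = ss_within / df_within           # V_s^2 in the text

f_blocks = ms_blocks / ms_error
f_treatments = ms_treatments / ms_error

rows = [
    ("Blocks", df_blocks, ss_blocks, ms_blocks),
    ("Treatments", df_treatments, ss_treatments, ms_treatments),
    ("Error", df_error, ss_error, ms_error),
    ("Samples within plots", df_within, ss_within, ms_within),
]
for name, df, ss, ms in rows:
    print(f"{name:<22}{df:>4}{ss:>10.4f}{ms:>10.4f}")
print(f"F (blocks) = {f_blocks:.2f}, F (treatments) = {f_treatments:.2f}")
```

Obtaining the error sum of squares by subtraction mirrors the usual hand computation; the plot total of 23.9997 then splits into blocks, treatments, and error without remainder.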
The relationship between these mean squares and the true variances may be shown as follows:

$$\frac{V_p^2}{mn} \rightarrow \frac{\sigma_p^2}{m} + \frac{\sigma_s^2}{mn} = \frac{1}{m}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right) \qquad (1)$$

It should also be noted that

$$V_s^2 \rightarrow \sigma_s^2 \qquad (2)$$

It is clear that $\sigma_p^2$ can be estimated from the above formula because $V_p^2$ and $V_s^2$ are both obtainable from the analysis of variance. In the present experiment, $V_p^2 = 0.1873$, $V_s^2 = 0.0034$, $n = 2$, and $m = 3$.

$$\frac{V_p^2}{mn} \rightarrow \frac{1}{m}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right), \quad \text{and} \quad 0.0034 \rightarrow \sigma_s^2$$

Therefore,

$$\frac{0.1873}{6} \rightarrow \frac{1}{3}\left(\sigma_p^2 + \frac{0.0034}{2}\right), \quad \text{or} \quad 0.0919 \rightarrow \sigma_p^2$$

The standard errors for plot and sample means are then calculated: $\sqrt{0.0919} = 0.3032 \rightarrow \sigma_p$, and $\sqrt{0.0034} = 0.0583 \rightarrow \sigma_s$. The ratio $\sigma_p/\sigma_s$ is estimated as 0.3032/0.0583 = 5.20. This indicates that the variation between plots greatly exceeds that within plots or between samples, the standard error being 5.20 times as great.

III. Economy in Sampling

It is of considerable importance to analyze how the precision of an experiment, as measured inversely by $(1/m)(\sigma_p^2 + \sigma_s^2/n)$, is affected by varying m, the actual plot replications in the field, and n, the number of samples drawn from a plot. The most important inference to be drawn is that the precision is mainly controlled by m, the number of plot replications. Increasing the number of samples taken from the different plots can only appreciably affect the precision when $\sigma_s^2$ is not relatively small as compared with $\sigma_p^2$. In the present problem 0.0034, the estimated value of $\sigma_s^2$, is small compared with 0.0919, the estimated value of $\sigma_p^2$. Hence, it must be concluded that to make more than one analysis on a
sample from a plot was unwarranted by the small gain that would result.

(a) Computation of Number of Samples or Replicates

The required variance of the mean of a treatment (K) would be:

$$K = \frac{1}{m}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right) \qquad (3)$$

For the data in this problem,

$$K = \frac{1}{3}\left(0.0919 + \frac{0.0034}{2}\right) = 0.0312$$

The computation, for different values of m or n, will give the number of replications and number of samples per plot that will be necessary to reduce the variance of the mean to a given level, i.e., K = 0.0312. The K values for the estimation of the variance for treatments are as follows when the number of analyses per sample is varied for three replications:

Number of Samples (n)      K
        1               0.0318
        2               0.0312
        3               0.0310
        4               0.0309
        5               0.0307

These data indicate the negligible effect when 1, 2, 3, 4, or 5 protein analyses are made from shelled corn samples.

(b) Determination of Minimum Expense

Technical difficulties often prevent plot replication beyond a certain degree. In such cases it is frequently worthwhile to strengthen the precision of the experiment by drawing replicate samples from the different plots. The number that should be drawn depends on several factors. These factors are: (1) the variation between plots as measured by $\sigma_p^2$ in relation to $\sigma_s^2$, and (2) the cost of growing a plot as compared with the cost of obtaining and analyzing replicate samples per plot. The time factor instead of the cost factor, or the combination of the two, should be considered in many types of experiments.

It is proposed to investigate how these relative costs determine a balance between plot replicates (m) and sample replicates (n) in order that a stated precision for an experiment may be obtained at a minimum expense. Let C represent the cost per plot replicated, and c the cost per sample replicate in the conduct of an experiment.
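The K values tabulated in section (a) above can be reproduced approximately with the following Python sketch. It assumes the mean squares are read as V_p² = 0.1873 and V_s² = 0.0034 with n = 2 determinations per plot, and that equation (3) has the form K = (σ_p² + σ_s²/n)/m; the function names are hypothetical. Small differences in the last decimal place from the printed table can arise from rounding.

```python
def estimate_components(v_p2, v_s2, n):
    """Solve V_p^2/n = sigma_p^2 + sigma_s^2/n, with sigma_s^2 estimated by V_s^2."""
    sigma_s2 = v_s2
    sigma_p2 = (v_p2 - v_s2) / n
    return sigma_p2, sigma_s2


def variance_of_treatment_mean(sigma_p2, sigma_s2, m, n):
    """Equation (3), as read here: K = (sigma_p^2 + sigma_s^2 / n) / m."""
    return (sigma_p2 + sigma_s2 / n) / m


sigma_p2, sigma_s2 = estimate_components(v_p2=0.1873, v_s2=0.0034, n=2)
print(f"sigma_p^2 ~ {sigma_p2:.4f}, sigma_s^2 ~ {sigma_s2:.4f}")

# Vary the number of analyses per plot with three replications.
for n in range(1, 6):
    k = variance_of_treatment_mean(sigma_p2, sigma_s2, m=3, n=n)
    print(f"m = 3, n = {n}: K = {k:.4f}")

# For contrast: one more replicate with a single analysis per plot.
k_extra_plot = variance_of_treatment_mean(sigma_p2, sigma_s2, m=4, n=1)
print(f"m = 4, n = 1: K = {k_extra_plot:.4f}")
```

Under these assumptions a fourth replicate with single analyses lowers K far more than any number of extra analyses on three replicates, which is the point the text makes about precision being controlled mainly by m.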
For a given treatment the total cost of plot replications will be mC, while the total cost of sample replications will be mnc. Hence E, the total expense per treatment, is given by:

$$E = mC + mnc \qquad (4)$$

A certain criterion of precision to be obtained may be represented by $K = (1/m)(\sigma_p^2 + \sigma_s^2/n)$, where K = the required variance of the mean for a treatment. In order to reduce (4) to an equation with one variable, it is found that $m = (1/K)(\sigma_p^2 + \sigma_s^2/n)$, a value which is substituted. Then,

$$E = \frac{1}{K}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right)(C + nc)$$

To reduce the total cost to a minimum, differentiate E with respect to n, and set the derivative equal to zero, viz.,

$$\frac{dE}{dn} = \frac{1}{K}\left[\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right)c + (C + nc)\left(-\frac{\sigma_s^2}{n^2}\right)\right] = 0$$

$$c\,\sigma_p^2 + \frac{c\,\sigma_s^2}{n} - \frac{C\,\sigma_s^2}{n^2} - \frac{c\,\sigma_s^2}{n} = 0$$

$$n^2 c\,\sigma_p^2 = C\,\sigma_s^2, \qquad n^2 = \frac{C\,\sigma_s^2}{c\,\sigma_p^2}, \quad \text{or} \quad n = \frac{\sigma_s}{\sigma_p}\sqrt{\frac{C}{c}}$$

Thus, the total cost will be a minimum when

$$n^2 = \frac{C\,\sigma_s^2}{c\,\sigma_p^2} \qquad (5)$$

In this case, n, and hence m, are determined to afford a most economical design. It is worthwhile to note that n is determined to be independent of K, the precision desired.

In the present experiment, the values are substituted in the above equation (5), viz., n = 2, $\sigma_p^2 = 0.0919$, and $\sigma_s^2 = 0.0034$. The ratio of costs will be:

$$\frac{C}{c} = \frac{0.0919 \times 2^2}{0.0034} = 108.12$$

From the standpoint of expense, the analysis of a duplicate sample from each plot would have been justifiable to produce the most economical design only if the cost per additional plot had been 108 times the cost per analysis.

IV. Sampling Practices

The important practical consideration in sampling is that sufficient units should be taken to give a reasonably accurate representation of the whole. In sampling processes a small representative amount of the material is analyzed. For field plots, Wishart and Sanders (1935) advise that the samples should amount to") per cent of the plot at the very least. It must be recognized that the use of the entire plot is the most reliable where it is feasible, as sampling can afford only an estimate of the plot yield.
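Returning to the minimum-expense result of section III (b), the following Python sketch evaluates equation (5) and checks numerically that it does give the cheapest design. It assumes equation (5) reads n² = Cσ_s²/(cσ_p²) and uses the estimates 0.0919 and 0.0034 from the protein example; the function names and the unit cost c = 1 are illustrative assumptions, and m is treated as a continuous quantity, as in the derivation.

```python
import math


def optimal_samples_per_plot(C, c, sigma_p2, sigma_s2):
    """Equation (5), as read here: n = sqrt(C * sigma_s^2 / (c * sigma_p^2))."""
    return math.sqrt((C * sigma_s2) / (c * sigma_p2))


def breakeven_cost_ratio(n, sigma_p2, sigma_s2):
    """Cost ratio C/c at which n samples per plot is the economical choice."""
    return n ** 2 * sigma_p2 / sigma_s2


def total_expense(n, K, C, c, sigma_p2, sigma_s2):
    """E = m(C + nc), with m chosen so that the treatment-mean variance equals K."""
    m = (sigma_p2 + sigma_s2 / n) / K
    return m * (C + n * c)


sigma_p2, sigma_s2 = 0.0919, 0.0034

# Cost ratio that would justify duplicate analyses (n = 2).
ratio = breakeven_cost_ratio(2, sigma_p2, sigma_s2)
print(f"C/c needed for n = 2: {ratio:.2f}")

# At that ratio the closed form returns n = 2, and a direct scan agrees.
n_opt = optimal_samples_per_plot(ratio, 1.0, sigma_p2, sigma_s2)
costs = {n: total_expense(n, 0.0312, ratio, 1.0, sigma_p2, sigma_s2)
         for n in range(1, 6)}
cheapest_n = min(costs, key=costs.get)
print(f"closed-form n = {n_opt:.2f}, cheapest n in scan = {cheapest_n}")
```

Because K enters the expense only as a common factor, changing the target precision rescales every entry in `costs` equally and leaves the cheapest n unchanged, in line with the remark that n is independent of K.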
Yates (1935) found yl per cent loss of information as an average of several experiments where sampling technic was employed. Some of the practices for different crop material are discussed below. To determine sampling errors it is necessary to draw at least two independent samples.

(a) Quadrat Methods

Some form of quadrat is generally used for sampling yield trials, or for the detailed study of vegetation. It was early pointed out by McCall (1917) that, while the harvest of the entire plot is most satisfactory in yield trials, it is attended with difficulties that make it practically impossible for plots away from the main station. A form of quadrat was suggested as a solution. The quadrat may be linear for a certain length of row, often being units of one foot, one yard, or one rod in length. The type of quadrat most frequently used in range and pasture experiments is a square area, usually a square meter or yard. The different kinds of area quadrats are described by Weaver and Clements (1929).

1. Small Grains: The rod-row unit is widely used by American investigators to harvest small grain plots by sampling methods. The English workers prefer one-foot or one-meter lengths of drill row.

The use of the rod row, to secure a yield estimate for the entire plot, was studied by Arny and Garber (1919). They harvested 9, 5, and 4 rod-row samples from 1/10-acre plots of wheat and oats, and subsequently the entire plot. They concluded that increases over the mean yield of the checks of 15.70 per cent for triplicate 1/10-acre plots, 9.49 per cent for the nine rod rows, 12.73 per cent for the five rod rows, and 14.M-!- per cent for the four rod rows were (on the average) probably significant. Nine rod rows removed from 1/10-acre plots were concluded to give practically as accurate yield determinations as the harvest of the entire plot.
It was admitted that the amount of labor required to remove nine rod rows was about the same as for the harvest of the entire plot. In studies with wheat and barley, Clapham (1929) obtained a standard error of less than 6 per cent for the yield estimate where 30 one-meter-length drill rows were harvested from l/jO-acre plots. The rod-row method was compared with meter lengths with six sets of five contiguous meter lengths. In barley, the standard error for 30 meter-lengths was 5.99 per cent, while the standard error calculated from sets (rod rows) was 7.20 per cent.

The use of the square yard, in addition to the rod row, has been studied by various workers. Kiesselbach (1917) determined the yield on 14 entire 1/40-acre plots compared to 20 areas 32x32 inches (quadrat areas). It was concluded that 20 systematically distributed areas may be safely substituted for the yield of the entire plot. Arny and Steinmetz (1919) concluded that four or five systematically distributed square-yard areas removed from 1/10-acre plots gave approximately the same error for yield as harvesting the entire plot.

2. Potatoes and Other Hill Crops: Yields of potatoes at Rothamsted were analyzed both by sampling and by harvesting the entire plot. There were 180 plants on each plot. Wishart and Clapham (1929) reasoned that the individual plant was the logical unit rather than a metrical unit. The actual number of plants necessary to determine the plot yield depended upon the uniformity of the crop and the plot size, but rarely was less than 10. A one-in-10 sample was inadequate for plots 1/90-acre in size, it being necessary to take every third or fourth plant on a plot that size to give an error of four per cent. It was concluded that there was little to gain by the use of sampling methods on plots less than 1/20-acre in size. It was better to harvest the entire plot for yield.
A pattern method has been used to sample sugar beet plots at harvest time, the individual beet being the unit. Wishart and Sanders (1935) make this explanation: "Suppose that there are 200 beets in a plot, and that it is proposed to take two sampling units of 10 beets each. Two numbers between 1 and 20 (inclusive) are drawn--say 4 and 16. The plot is then covered by walking along one row, back on the next, and so on, and the 4th, 24th, 44th beet pulled for one sampling unit, and the 16th, 36th, 56th beet for the other sampling unit, totals only being recorded in each case."

Small samples of 10 roots for sugar analysis were criticized by Johnson (1929). Five samples of 10 roots from 1/40-acre plots gave a difference as high as 2.2 per cent sugar. He concluded the results were unreliable within one per cent of sugar each way. For 50-beet samples taken in groups of 10 he reduced the estimate of significance to 0.534 per cent and 0.593 per cent sugar in two experiments.

3. Pastures: The area quadrat has been widely used in pasture investigations, the usual size being a square meter. For the determination of the abundance and frequency of species in range pastures, Hanson (1934) concluded that a quadrat two meters in size was desirable. Stewart and Hutchings (1936) have suggested the Point-Observation-Plot method for vegetation surveys. These plots are 100 square feet in area, being marked off by a circle 5.64 feet in radius. This method is claimed to be more rapid than the ordinary quadrat method. It is also suitable for statistical analysis.

The vertical point method has been advocated recently by Tinney et al. (1937). Two horizontal pipes are mounted on legs 12 inches high, with a linear row of 10 holes spaced two inches apart through which needles 14 inches long are moved up and down. The point is pushed down until it touches a plant or a bare spot. The number of times a species is hit per 100 readings (needles) is expressed directly in per cent.
The needle may hit a number of different species, e.g., bluegrass 32, timothy 30, redtop 12, red clover 3>, and bare space 8. A modification is the inclined point method. Others who have published on quadrats for pastures are Brown (1937) and Robinson et al. (1937).

(b) Sampling from Bulk Material

So far sampling has dealt with plots during growth and harvest. Sampling of harvested produce, or bulk material, is another important form found in experimentation. In laboratory determinations, a sample of the material is mixed, and one or two sub-samples taken. A good method is to mix the heap of material thoroughly, to divide it into four quarters, and to reject, for example, the N.E. and S.W. quarters, mixing the other two again. The process is repeated until the bulk is reduced to the size required for a sample.

1. Protein Determinations: Duplicate samples were taken in 241 cars of wheat by Coleman et al. (1926) to determine the accuracy of sampling for protein. The cars were sampled twice in 5 different areas with a grain probe. The contents of both samples were composited separately and each reduced to 73 grams in size. Over 96 per cent of the tests varied less than 0.25 per cent in protein. In a study of sample size the error was found to be less than 0.10 per cent when the samples weighed 30 grams or more. The error was higher for smaller samples. Bartlett and Greenhill (1936) and Leonard and Clark (1936) found that one protein determination per replicate reduced the error more rapidly when the number of replicates was increased than by the use of duplicate laboratory analyses. The latter workers found the cost ratio of plot replicate to sample replicate for protein determinations in corn. The analysis of a duplicate sample from each plot would have been justified in producing the most economical design only if the cost per plot had been 108 times the cost per analysis.

2.
Shr inkage Samples in R e- rage: Two or three samples per plot were found by Wilkins and Inland (193^) to accurately measure the water content of forage on individual plcts of alfalfa and red clover. Samples 2 to k- "pound's in size were considered best. 3. Purity and Germinatio n Tests in Seeds : Tests for germination and purity in seed analyses are affected "by personal, sample, and random ei~rcrs . Standard rules (I927) for seed testing specify minimum sample s3z.es as follows: (1) Two ounces of grass seeds; (2) Five ounces of red or crimson clover ; alfalfa, rye grasses, breme grasses, mi.ll.et, flax, rape, or seeds of similar size; (3) one pound of cereal, vetches, or seeds of similar size. Collins (1929) -has set forth the procedure for statistical anal3'ses of purity and germination tests . References 1. Anonymous. Rules for Seed Testing, Dejpt . Cir. ko6 } U.S.I. A. 1937- 2. Amy, A. C., and Garter, R. .J. Field Technic in Determining Yields of Plots of Grain "by the Bod Row Method. Jour:, Am. Sue. Agron., 11:33-47. 1919. 3. Amy, A. C, and Steimaetz, F. H. Field Technic in Determining Yields of Experi- mental Plots by the Square Yard Method. Jour. Am. Soc. Agron., 11:81-106. 1919 . k. Bartlett, M. S., and Greenhiil, A. V, The Relative Importance of Plot Yariat.u.n .and of Field and Laboratory Sampling Errors in Small plot Pasture Productivity Experiments. Jour. Agr. Sol., 238-262. 193o. p. Brown, B. A, Tochnic in Pasture Research. Jour. Am. Soc. Agron., 29:^-68-^76. 1937. 6. Clapham, A. P. The Estimation of Yield of Cereals by Sampling Methods. Jour. Agr. Sci., 19:214-233. I929, 7. Coleman, P. A., et al. Testing Wheat for Protein with a Recommended Method for Making the Test, Bept. Bui. 1460. U.S.D.A. I926. 8. Collins, G. P. The Application of Statistical Methods to Seed Testing, Cir. 79- U.S.D.A. 1929. 9. Fisher, S. A. Statistical Methods for Research Workers, Oliver and Boyd. (Vth edition). 1933 • 10. Hanson, H. C. 
Size of List Quadrat for Use in Determining Effects of Different Systems of Grazing Upon Agropyron Smithii Mixed Prairie, Jour. Agr. Res., kl:3!+9-360. 11. Immer, F. R. Sampling Technic. Mimeographed outline, U. of Minnesota. 193 c » 12. Immer, P.P. A Study of Sampling Technic with Sugar Beets. Jour. Agr. Res,, ^:633-oV7. I.932, 13. Immer, F. P., and LeClerg, P. L. Errors of Routine Analysis for Percentage of Sucrose and Apparent Purity Coefficient with Sugar Beets taken from Field Ex- periments. Jour. Agr. Pes., 92: oG^-PIi- 1936; Ik. Johnson, S. T. A Pete on the Sampling of the Sugar Boot, Jour. Agr. Sc, 19:311-315. 1929. 13. Kiessellach, T. A. Studies Concerning the Elimination of Experimental Error in Comparative Crop Tests, Nebraska Research Bui. 13. 1917- 16. Leonard, ¥. H.., and Clark, A. Protein Content of Corn as Influenced by Labora- tory Analyses and Field Replication. Colo. Exp. Sta. Tech. Bui. .19 « 1936. 17. McCall, A. G. A Pew Method of Harvest ine: Small Grain and Grass Plots. Jour. Am. Soc. Agron., 9:138-lik0. I9I7. 18. Robinson, 'P. P., Pierre, W. E., end Ackerman, P. A. A Comparison of Grazing .and Clipping for' Determining the Response of P; remanent Pastures to Fertilization. Jour. Am. Soc. Agron., 29: 3-1-9 -339. 1937- 195 19. Stewart, G., and Hutchings, S. S. The Point -Observation -Plot (Square-Foot - Density) Method of Vegetation Survey. Jour. Am. Soc. Agron., 28:714-722. 1936. 20. Tinney, F. W., Aamodt, 0. S., and Ahlgren, H. L. Preliminary Report of a Study on Methods used in Botanical Analyses of Pasture Swards. Jour. Am. Soc. Agron., 29:835-3^0. 1937. . 21. Tippett, L. H. C. The Methods of Statistics. Williams and Norgate, pp. 125-130, and 20^-226. 1937- 22. Weaver, J. E., and Clements, F. E. Plant Ecology, pp. lO-lj-2. 1929. 23. Wilkins, F. S., and Hyland, H. L. The Significance and Technique of Dry Matter Determinations in Yield Tests of Alfalfa and Red Clover. la. Agr. Exp. Sta. Res. Bui. 2^0. I938. 2k. Wishart, J., and Clapham, A. 
P. A Study of Sampling Technic: The Effect of Artificial Fertilizers on the Yield, of Potatoes, Jour. Agr. Sci., 19:600-6l8. 1929. 25. Wishart, J., and Sanders, E.G. Principles and Practice of Yield Trials. Empire Cotton Growing Corp., pp. k2-k5, and 85-97. 193&. 26. Yates, F. ; Some Examples of Biased Sampling. Ann. Eugenics, 6:202-213. 1935. 27. Yates, F., and Zacopanay, I. The Estimation of the Efficiency of Sampling with Special Reference to Sampling for yield in Cereal Experiments. Jour. Agr. Sci., 25:5^5-577. 1935. Questions for Discussion 1. Under what conditions may it be desirable to take samples rather than use all the available material? 2. Are yields determined from samples as accurate as when the entire plot is har- vested for yield? Why? 3. In a randomized block trial what makes up the variance due to varieties? k. When can an increase in the number of samples effect an appreciable increase in precision? 5. What were the conclusions in the shelled com data on the number of samples for protein analysis? 6. What factors influence the number of samples that should be drawn so far as the precision of the experiment is concerned? 7- Explain how to determine the most economical design from the standpoint of cost. 8. Under what conditions are quadrat methods used for small grain harvest? How many quadrats are usually advised for l/lC-acre plots? 9. Compare the rod, meter-length, and area quadrats for small grains. 10. What is the logical sampling unit for potatoes or sugar beets? How many samples should be taken? 11. Describe how to use a pattern method of sampling for sugar beets and similar crops . 12. Describe the different quadrat methods used in pasture studies. 13. Give sampling precautions and technic to use in bulk material. \k. What errors may be introduced in seed testing? Give the sample sizes generally used for analyses. Problems Four varieties of crested wheat grass were grown in a randomized block trial in plots l/80-acre in size. 
Four replicated plots of each variety were grown, the yield data being obtained from six quadrats per plot. The yields in pounds per square yard sam- ple are given below. (Data from Dr. T. M. Stevenson, U. of Saskatchewan). 194 Yields of Crested Wheat Grass Square Yard Block Mecca:S-,l CW.G; :S -10 CW.G. :S-11 CW.G. :Uns (No.) (No.) (lbs.) (lbs.) (lbs.) (lbs.) 1 I 0.52 0.68 0.48 0.58 2 oJi-9 0.62 0.55 O.58 3 0.59 0.70 0.46 0.61 4 O.36 . 70 . 58 O.63 P G.28 0.62 0.51 O.65 6 G.49 0.66 0.38 O.71 I IT 0-.61 O.77 0.44 0.68 2 0.49 0.91 0.48 0.43 3 O.52 0.89 0.U9 0.75 4 0.56 0.95 0.61 O.71 5 0.57 . 77 . 58 0.65 6 0.49 0.77 0.4l 0.68 1 III . 52 0.Q;: 0.27 0.42 o 0.42 O.77 0.61 . 51 5 0.66 0.46 0.44 . 58 4 0,57 o.3i 0.51 0.54 o . 59 0.58 0.61 . 66 6 0.56 0.35 0.4l 0.58 T, 17 0.42 0.70 0.55 . JO 2 0.51 0.37 0.72 0.30 3 0.V7 O.53 0.63 0.44 4 . '50 0.60 0.65 0.66 ir 0.55 . 64 . 48 0.63 6 .26 ' 0.33 O06 o.4i . 1. Calculate the analysis of variance for the crested wheat grass yields for a sub- division of i within plots division of the total variation into blocks , varieties, error , and square yards 2. Compare the variance due to samples and that due to replications. Make a state- ment on the number of samples and number of plot replicates that you would recom- mend in a subsequent experiment . 3. Determine the most economical design from the data at hand, i.e., the ratio of replicate cost to sample cost. uH CHAPTER XVII COMPLEX EXPERIMENTS ^/ I . Use of Complex Experiments The present trend in field experiments is toward somewhat complicated designs which permit the study of several factors in one large-scale comprehensive experiment. There are several advantages of the complex experiment. (1) One that includes sev- eral treatments in all possible combinations permits a broad basis for generalization due to the fact that the interactions, as well as the main effects, can be studied. 
It is obvious that field experimental results may be influenced materially by environmental conditions with the result that a combination of factors may provide a more satisfactory answer to the problems under study. (2) The degrees of freedom for error variance are higher than would be the case for single experiments designed to study each factor separately. This leads to greater precision in the results. (See Paterson, 1939).

The value of a complex experiment depends upon a careful analysis of the problem and the various treatment combinations to be tested. The amount of complexity introduced depends also upon the facilities and funds available. It is a safe precaution to key-out the degrees of freedom for the various factors to be tested before field work begins to be sure that the proposed plan is satisfactory. After the data are collected the investigator should make certain that the data are sufficiently homogeneous to combine in a single test. (See section VII).

II. Application to a Barley Variety Trial

In agronomic tests of cereal crop varieties it is often desirable to conduct the trials at various points in the area under consideration and to carry them on for a period of years. Some data collected by Immer and others (1934) on the yield of barley varieties tested in randomized blocks in 4 locations in Minnesota for a 2-year period will be used to illustrate the method of computation. The data are based on 6 square-yard samples harvested from each plot of approximately 1/40-acre each. Each test consisted of 3 randomized blocks. The same 5 varieties were tested at University Farm, Waseca, Crookston, and Grand Rapids for the years 1932 and 1935. The yields in bushels per acre for each plot of each variety are given below in Table 1.

1/ From Dr. F. R. Immer, University of Minnesota, with minor modifications.

Table 1. Yields of five varieties of barley, replicated 3 times in each of 4 locations in 1932 and 1935.

                  Block number, 1932              Block number, 1935         Tot. for
Variety        I      II     III    Tot.       I      II     III    Tot.    both years

University Farm
Manchuria     19.7   31.4   29.6    80.7      45.5   50.3   60.0   155.8      236.5
Glabron       28.6   38.3   43.5   110.4      47.5   41.1   49.4   138.0      248.4
Velvet        20.3   27.5   32.6    80.4      54.2   52.3   64.5   171.0      251.4
Wis. #38      27.9   40.0   46.1   114.0      62.2   53.1   74.7   190.0      304.0
Peatland      22.3   30.8   31.1    84.2      47.4   57.8   50.5   155.7      239.9
Total        118.8  168.0  182.9   469.7     256.8  254.6  299.1   810.5     1280.2

Waseca
Manchuria     40.8   29.4   30.2   100.4      53.9   58.8   47.7   160.4      260.8
Glabron       44.4   34.9   33.9   113.2      63.7   61.1   52.2   177.0      290.2
Velvet        44.6   41.4   26.2   112.2      53.9   59.1   56.4   169.4      281.6
Wis. #38      39.8   39.2   29.1   108.1      74.2   75.6   67.0   216.8      324.9
Peatland      71.5   47.6   55.4   174.5      51.1   47.3   45.0   143.4      317.9
Total        241.1  192.5  174.8   608.4     296.8  301.9  268.3   867.0     1475.4

Crookston
Manchuria     34.7   29.1   35.1    98.9      42.1   47.1   30.8   120.0      218.9
Glabron       28.8   28.7   21.0    78.5      38.8   29.4   30.5    98.7      177.2
Velvet        29.8   38.4   28.0    96.2      42.1   40.0   39.8   121.9      218.1
Wis. #38      27.7   27.6   20.4    75.7      44.3   43.5   47.7   135.5      211.2
Peatland      43.0   32.7   32.0   107.7      53.9   51.8   50.3   156.0      263.7
Total        164.0  156.5  136.5   457.0     221.2  211.8  199.1   632.1     1089.1

Grand Rapids
Manchuria     20.2   30.2   16.0    66.4      26.6   26.5   32.7    85.8      152.2
Glabron       13.2   20.5    9.6    43.3      21.4   18.7   24.1    64.2      107.5
Velvet        24.5   41.6   30.6    96.7      20.7   26.8   30.4    77.9      174.6
Wis. #38      19.0   18.4   24.6    62.0      20.7   23.6   30.9    75.2      137.2
Peatland      27.6   30.0   22.7    80.3      32.6   40.0   34.2   106.8      187.1
Total        104.5  140.7  103.5   348.7     122.0  135.6  152.3   409.9      758.6

Total,
4 stations   628.4  657.7  597.7  1883.8     896.8  903.9  918.8  2719.5     4603.3

III. Analysis of Test Into Components

The analysis of a complex experiment of this type is merely an extension of the analysis of variance as applied to the randomized block test.
The various factors together with their degrees of freedom may be represented as follows: It is noted that all block x variety 1/ interactions are included in error.

1/ The symbol (x) in this connection denotes interaction.

Variation due to:                                   Degrees of Freedom
Blocks                                                       2
Stations                                                     3
Years                                                        1
Varieties                                                    4
Interactions of:
  Varieties x Stations                                      12
  Varieties x Years                                          4
  Varieties x Stations x Years                              12
  Stations x Years                                           3
  Blocks x Stations                                          6
  Blocks x Years                                             2
  Blocks x Stations x Years                                  6
  Blocks x Varieties                            8 )
  Blocks x Varieties x Stations                24 )  Error   64
  Blocks x Varieties x Years                    8 )
  Blocks x Varieties x Stations x Years        24 )
Total                                                      119

There will be a total of 119 degrees of freedom for the combined test since there are 120 plots. The degrees of freedom for the main effects will be one less than the number of blocks, stations, years, and varieties, respectively. The degrees of freedom for an interaction are obtained by multiplication of the degrees of freedom for the variables involved. For example, varieties x stations will be (4)(3) = 12. For the second order interaction, varieties x stations x years, the degrees of freedom will be (4)(3)(1) = 12. After the degrees of freedom are keyed-out, the remainder of the computation must be made in accordance with this plan.

IV. Computation of the Sums of Squares

The correction factor, $(Sx)^2/N$, is computed for the total yield of the 120 plots, i.e., $(4603.3)^2/120$ = 176,586.4241. This factor will be used for the entire test.

For the total sum of squares, the 120 individual plot yields are squared and summed. This value, $S(x^2)$, is equal to 200,879.35. The correction factor is then subtracted, viz.,

200,879.35 - 176,586.4241 = 24,292.9259 (XIII) 2/

The data for the remainder of the computations are grouped from table 1 into tables, each with two variables. The sums of squares are computed the same as for a simple randomized block test. It is to be noted that totals are used for the variable concerned.
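The correction factor and the total sum of squares described above can be checked with a short script. This is only a sketch, not part of the original computing routine: the dictionary name, the function names, and the block-by-block arrangement of the yields are assumptions made for illustration, and the plot yields are those read from Table 1.

```python
# Plot yields (bu. per acre) as read from Table 1. Each test holds three
# blocks; within a block the varieties run in the order Manchuria, Glabron,
# Velvet, Wis. #38, Peatland. The names below are illustrative only.
TABLE_1 = {
    ("Univ. Farm", 1932): [[19.7, 28.6, 20.3, 27.9, 22.3],
                           [31.4, 38.3, 27.5, 40.0, 30.8],
                           [29.6, 43.5, 32.6, 46.1, 31.1]],
    ("Univ. Farm", 1935): [[45.5, 47.5, 54.2, 62.2, 47.4],
                           [50.3, 41.1, 52.3, 53.1, 57.8],
                           [60.0, 49.4, 64.5, 74.7, 50.5]],
    ("Waseca", 1932): [[40.8, 44.4, 44.6, 39.8, 71.5],
                       [29.4, 34.9, 41.4, 39.2, 47.6],
                       [30.2, 33.9, 26.2, 29.1, 55.4]],
    ("Waseca", 1935): [[53.9, 63.7, 53.9, 74.2, 51.1],
                       [58.8, 61.1, 59.1, 75.6, 47.3],
                       [47.7, 52.2, 56.4, 67.0, 45.0]],
    ("Crookston", 1932): [[34.7, 28.8, 29.8, 27.7, 43.0],
                          [29.1, 28.7, 38.4, 27.6, 32.7],
                          [35.1, 21.0, 28.0, 20.4, 32.0]],
    ("Crookston", 1935): [[42.1, 38.8, 42.1, 44.3, 53.9],
                          [47.1, 29.4, 40.0, 43.5, 51.8],
                          [30.8, 30.5, 39.8, 47.7, 50.3]],
    ("Grand Rapids", 1932): [[20.2, 13.2, 24.5, 19.0, 27.6],
                             [30.2, 20.5, 41.6, 18.4, 30.0],
                             [16.0, 9.6, 30.6, 24.6, 22.7]],
    ("Grand Rapids", 1935): [[26.6, 21.4, 20.7, 20.7, 32.6],
                             [26.5, 18.7, 26.8, 23.6, 40.0],
                             [32.7, 24.1, 30.4, 30.9, 34.2]],
}


def correction_factor(values):
    """(Sx)^2 / N for a list of single-plot yields."""
    total = sum(values)
    return total * total / len(values)


def total_sum_of_squares(values):
    """S(x^2) minus the correction factor."""
    return sum(x * x for x in values) - correction_factor(values)


# Flatten the eight tests into the 120 single-plot yields.
plots = [x for test in TABLE_1.values() for block in test for x in block]

grand_total = sum(plots)
cf = correction_factor(plots)
ss_total = total_sum_of_squares(plots)

print(len(plots), round(grand_total, 1))   # 120 4603.3
print(round(cf, 4), round(ss_total, 4))    # 176586.4241 24292.9259
```

The same two helper functions apply unchanged to any one of the eight tests taken alone, which is how the separate-test totals of a simple randomized block analysis would be obtained.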
For this reason, the sums of squares obtained must be divided by the number of basic plots included in the respective totals (to reduce the variables to a single-plot basis) in order for the common correction factor to apply. The combined data for a comparison of varieties and stations are given in table 2.

2/ The roman numeral refers to the line in Table 8.

Table 2. Total yields grouped for varieties and stations

                            Station
Variety       U. Farm   Waseca   Crookston   Grand Rapids    Total
Manchuria      236.5     260.8     218.9        152.2         868.4
Glabron        248.4     290.2     177.2        107.5         823.3
Velvet         251.4     281.6     218.1        174.6         925.7
Wis. #38       304.0     324.9     211.2        137.2         977.3
Peatland       239.9     317.9     263.7        187.1        1008.6
Total         1280.2    1475.4    1089.1        758.6        4603.3

These data are taken directly from the right-hand column of table 1. The computation of the sums of squares is carried out as follows:

Total = $S(x_{vs}^2)/6 - (Sx)^2/N$ = 1,129,020.73/6 - 176,586.4241 = 188,170.1216 - 176,586.4241 = 11,583.6975

Varieties = $S(x_v^2)/24 - (Sx)^2/N$ = 4,261,251.19/24 - 176,586.4241 = 177,552.1329 - 176,586.4241 = 965.7088

Stations = $S(x_s^2)/30 - (Sx)^2/N$ = 5,577,329.97/30 - 176,586.4241 = 185,910.9990 - 176,586.4241 = 9,324.5749

Varieties x Stations = Total - (Varieties + Stations) = 11,583.6975 - 10,290.2837 = 1,293.4138

The values for varieties, stations, and the varieties x stations total are included in table 7, where the steps for computation are indicated. The interaction values are included in table 8.

The sums of squares for the other factors are computed in a similar manner, the data being given in Tables 3, 4, 5, and 6, with the results included in table 7. In table 3 are given the data for comparisons of varieties and years, the yields at the four stations being totaled.

Table 3. Total yields grouped for varieties and years

                    Year
Variety        1932      1935      Total
Manchuria      346.4     522.0     868.4
Glabron        345.4     477.9     823.3
Velvet         385.5     540.2     925.7
Wis. #38       359.8     617.5     977.3
Peatland       446.7     561.9    1008.6
Total         1883.8    2719.5    4603.3

In table 4 are assembled the data for comparisons of blocks and stations, the block totals for the two years of each station being added.

Table 4. Total yields of blocks and stations

                            Station
Block         U. Farm   Waseca   Crookston   Grand Rapids    Total
I              375.6     537.9     385.2        226.5        1525.2
II             422.6     494.4     368.3        276.3        1561.6
III            482.0     443.1     335.6        255.8        1516.5
Total         1280.2    1475.4    1089.1        758.6        4603.3

In table 5 are the totals for comparison of blocks and years. This table is assembled from the totals at the bottom of table 1.

Table 5. Total yields of blocks and years

                    Year
Block          1932      1935      Total
I              628.4     896.8    1525.2
II             657.7     903.9    1561.6
III            597.7     918.8    1516.5
Total         1883.8    2719.5    4603.3

One other table is necessary, that of stations and years.

Table 6. Total yields of stations and years

                            Station
Year          U. Farm   Waseca   Crookston   Grand Rapids    Total
1932           469.7     608.4     457.0        348.7        1883.8
1935           810.5     867.0     632.1        409.9        2719.5
Total         1280.2    1475.4    1089.1        758.6        4603.3

The calculation of the sums of squares for the complete analysis can be performed with the least difficulty and confusion when the steps are carried through in a routine manner. Many of the calculations are given in table 7. The remainder follow easily and logically.

Table 7. Calculation of sums of squares

                 (1)           (2)       (3)        (4)          (5)            (7)        Key to
Variate        Total of       Calc.     No. of     Single      (1)/(4)        Sum of      table 8
squared        squares        from     variates   plots in                    squares
                              table    squared    each tot.                   (5)-(6)
S(x^2)        200,879.35       1        120          1       200,879.3500    24,292.9259   XIII
S(x_s^2)    5,577,329.97       2          4         30       185,910.9990     9,324.5749   II
S(x_y^2)   10,944,382.69       3          2         60       182,406.3782     5,819.9541   III
S(x_v^2)    4,261,251.19       2          5         24       177,552.1329       965.7088   IV
S(x_b^2)    7,064,601.85       4          3         40       176,615.0462        28.6221   I
S(x_sy^2)   2,897,377.01       6          8         15       193,158.4673    16,572.0432
S(x_vs^2)   1,129,020.73       2         20          6       188,170.1216    11,583.6975
S(x_vy^2)   2,206,627.61       3         10         12       183,885.6342     7,299.2101
S(x_bs^2)   1,871,824.37       4         12         10       187,182.4370    10,596.0129
S(x_by^2)   3,650,180.03       5          6         20       182,509.0015     5,922.5774
S(x_vsy^2)    593,855.03       1         40          3       197,951.6767    21,365.2526
S(x_bsy^2)    974,322.93       1         24          5       194,864.5860    18,278.1619

Column 6, $(Sx)^2/N$, is 176,586.4241 on every line.

A notation found to be very convenient in practice is to let $S(x^2)$ be the sum of the squares of the individual plots. The station totals are designated $x_s$, the variety totals $x_v$, etc. The totals for varieties at the separate stations are designated $x_{vs}$, varieties in different years by $x_{vy}$, etc. These are given in table 7.

In column 1 of this table are the sums of the squares of the variates concerned and in column 2 is given the table from which they have been computed. Thus, $S(x_s^2)$ is calculated from the 4 station totals of table 2. $S(x_v^2)$ is calculated from the variety totals of table 2. The value of $S(x_{vs}^2)$ is the sum of the squares of the 20 yields for each variety at each station separately in table 2. Column 3 of table 7 gives the number of figures squared under column 1. Column 4 gives the number of single plots contained in each figure squared. Column 5 is simply column 1 divided by column 4. This is necessary to reduce the sums of squares to a single-plot basis throughout. The key numbers refer to that sum of squares in the complete analysis of variance given as table 8.

The sums of squares for total, blocks, stations, years and varieties are transferred directly from table 7 to table 8.

The interaction sums of squares can be obtained from table 7 by subtraction. The sum of squares for interaction of varieties x stations will be found by subtraction of the sums of squares for varieties and stations separately from the sum of squares opposite $S(x_{vs}^2)$ in table 7, e.g.

   11,583.6975   (19 D.F.)
    - 965.7088   ( 4 D.F. for varieties)               IV
  - 9,324.5749   ( 3 D.F. for stations)                II
    1,293.4138   (12 D.F. for varieties x stations)    V

Since there were 20 figures used to obtain $S(x_{vs}^2)$ there would be 19 degrees of freedom. The interaction degrees of freedom are obtained by subtraction, e.g., 19 - (4 + 3) = 12. All other first order interactions are obtained in the same manner.

The second order interaction of varieties x stations x years is obtained by subtracting from the sum of squares opposite $S(x_{vsy}^2)$ in table 7 the sums of squares for varieties, stations and years separately and their first order interactions in all possible combinations. Thus:

   21,365.2526   (39 D.F.)
    - 965.7088   ( 4 D.F. for varieties)                       IV
  - 9,324.5749   ( 3 D.F. for stations)                        II
  - 5,819.9541   ( 1 D.F. for years)                           III
  - 1,293.4138   (12 D.F. for varieties x stations)            V
    - 513.5472   ( 4 D.F. for varieties x years)               VI
  - 1,427.5142   ( 3 D.F. for stations x years)                VIII
    2,020.5396   (12 D.F. for varieties x stations x years)    VII

The sums of squares for the main effects and the first order interactions needed for the computation of the second order interaction are taken from table 8 opposite the appropriate key number. The interaction of blocks x stations x years is obtained in a similar manner.

The complete analysis is now carried out in table 8, the error sum of squares being obtained as a remainder.

Table 8. Complete analysis of variance

Key                                                        Sums of        Mean
No.     Variation due to:                      D.F.        Squares        Square      F-Value 1/
I       Blocks                                   2          28.6221       14.3110
II      Stations                                 3       9,324.5749    3,108.1916     162.85**
III     Years                                    1       5,819.9541    5,819.9541     304.92**
IV      Varieties                                4         965.7088      241.4272      12.65**
        Interaction of:
V         Varieties x stations                  12       1,293.4138      107.7845       5.65**
VI        Varieties x years                      4         513.5472      128.3868       6.73**
VII       Varieties x stations x years          12       2,020.5396      168.3783       8.82**
VIII      Stations x years                       3       1,427.5142      475.8381      24.93**
IX        Blocks x stations                      6       1,242.8159      207.1360      10.85**
X         Blocks x years                         2          74.0012       37.0006       1.94
XI        Blocks x stations x years              6         360.6795       60.1132       3.15**
XII     Error                                   64       1,221.5546       19.0868
          (Blocks x varieties                    8)
          (Blocks x varieties x stations        24)
          (Blocks x varieties x years            8)
          (Blocks x varieties x sta. x yrs.     24)
XIII    Total                                  119      24,292.9259

**Exceeds the 1 per cent point in Snedecor's table of "F".
1/ For comparison with error.

In practice, it is unnecessary to key-out the complete analysis as given in table 8. The variation due to blocks, blocks x stations, blocks x years, and blocks x stations x years should be grouped as one quantity, being designated as "Blocks within stations and years" or "Blocks within tests" (for 16 D.F.). The reason for this is readily apparent. The blocks are numbered I, II, and III arbitrarily. Block I at University Farm has no relation to Block I at Waseca or any other station. Thus, it is an error to regard blocks as a factor that occurs at several definite levels (3 in this case). The correct procedure is therefore to compute the block sums of squares for each experiment and combine them to present in the final analysis. The analysis of variance may then be presented as in table 9.

Table 9. Analysis of variance (in summary form)

                                                 Sums of        Mean
Variation due to:                     D.F.       Squares        Square      F-Value
Blocks within tests                    16      1,706.1187      106.6324       5.59**
Stations                                3      9,324.5749    3,108.1916     162.85**
Years                                   1      5,819.9541    5,819.9541     304.92**
Varieties                               4        965.7088      241.4272      12.65**
Interactions:
  Varieties x stations                 12      1,293.4138      107.7845       5.65**
  Varieties x years                     4        513.5472      128.3868       6.73**
  Varieties x stations x years         12      2,020.5396      168.3783       8.82**
  Stations x years                      3      1,427.5142      475.8381      24.93**
Error                                  64      1,221.5546       19.0868
Total                                 119     24,292.9259

The analysis may be summarized still further in case the investigator is not interested in the variation due to stations, years, and stations x years. He may group these factors into variation due to "tests within stations and years" or simply "tests" (for 7 D.F.). The variety factor and its interactions would be given in detail because it has definite biological significance.

V. Sums of Squares in Simple vs. Complex Experiments

It will be useful at this stage to relate the complete analysis in Table 8 with the simple randomized block tests computed for each of the 8 tests separately. The sums of squares for total, blocks, varieties, and error are given in table 10 for the 8 tests.

Table 10. Sums of squares calculated from the tests separately

                       Sum of squares   Sum of squares   Sum of squares   Sum of squares
Test and year            for total        for blocks      for varieties     for error
U. Farm - 1932            867.2973         450.0973          375.6106         41.5894
U. Farm - 1935          1,031.7133         251.6253          506.3600        273.7280
Waseca - 1932           1,907.1360         471.3960        1,196.3293        239.4107
Waseca - 1935           1,203.5600         131.1480          993.8400         78.5720
Crookston - 1932          487.8733          80.8333          252.6266        154.4134
Crookston - 1935          807.8360          49.2040          595.8227        162.8093
G. Rapids - 1932          905.5573         179.6853          536.1640        189.7080
G. Rapids - 1935          509.9093          92.1293          336.4560         81.3240
Total                   7,720.8825       1,706.1185        4,793.2092      1,221.5548

It is noted that the sums of squares for error for the 8 separate tests add to 1,221.5548. This agrees with the sum of squares for error of table 8 (1,221.5546), the discrepancy being due to dropping of decimals. There were 8 D.F. for error in each separate test or 8 by 8 = 64 in all tests. These same 64 D.F. were used for error in the complete analysis in table 8. The error used in table 8 is, therefore, simply the average error of the separate tests.

The sums of squares added for blocks, in the 8 tests, give 1,706.1185 (see table 10). Addition of the sums of squares for blocks, blocks x stations, blocks x years, and blocks x stations x years from table 8 gives a total of 1,706.1187, which agrees also. Further comparisons are given in table 11.

Table 11. Comparison of degrees of freedom and sums of squares of the 8 separate tests with the complete analysis of table 8.

                  Calculated from              Calculated from the
                  8 separate tests             complete analysis
Variation
due to:         D.F.           Sum of Sq.      D.F.     Sum of Sq.     Key to table 8
Blocks        (8)(2)  =  16    1,706.1185       16      1,706.1187     I, IX, X, XI
Varieties     (8)(4)  =  32    4,793.2092       32      4,793.2094     IV, V, VI, VII
Error         (8)(8)  =  64    1,221.5548       64      1,221.5546     XII
Total         (8)(14) = 112    7,720.8825      112      7,720.8827     I, IV, V, VI, VII, IX, X, XI, XII

From the above table the analogy between the separate analyses of variance for each test and the complete analysis is clear. The 112 D.F. for total in table 11 correspond to the total sum of squares within tests. When the 7 D.F. between tests (i.e., stations = 3, years = 1 and stations x years = 3) are added, the full 119 D.F. is obtained. The same is true for the sums of squares.

VI. Interpretation of the Data

The manner in which the data can be interpreted will now be illustrated. From table 8 it is seen that the variance (mean square) due to varieties compared with variance due to error exceeds the 1 per cent point. Therefore, some of the varietal differences are significant. The mean yields for varieties, computed from Table 1, are given in table 12.

Table 12. Mean yields of 3 plots of each variety, average yields for both years and average yields of varieties for all tests.

                                      Variety
Station and year       Manchuria   Glabron   Velvet   Wis. #38   Peatland
University Farm
  1932                    26.9       36.8     26.8      38.0       28.1
  1935                    51.9       46.0     57.0      63.3       51.9
  Mean yield              39.4       41.4     41.9      50.7       40.0
Waseca
  1932                    33.5       37.7     37.4      36.0       58.2
  1935                    53.5       59.0     56.5      72.3       47.8
  Mean yield              43.5       48.4     46.9      54.2       53.0
Crookston
  1932                    33.0       26.2     32.1      25.2       35.9
  1935                    40.0       32.9     40.6      45.2       52.0
  Mean yield              36.5       29.5     36.4      35.2       44.0
Grand Rapids
  1932                    22.1       14.4     32.2      20.7       26.8
  1935                    28.6       21.4     26.0      25.1       35.6
  Mean yield              25.4       17.9     29.1      22.9       31.2
Mean for all stations     36.2       34.3     38.6      40.7       42.0

The variance due to error was 19.0868 (table 8). The standard error of a single plot would be $\sqrt{19.0868}$ = 4.369 bu. Since 24 plots are involved in the variety averages for all stations, the standard error of the mean of 24 plots is $4.369/\sqrt{24}$ = 0.892 bu. The standard error of the difference between two such means would be $0.892\sqrt{2}$ = 1.26 bu.
With 64 D.F. for error one may accept twice the standard error of the difference as a level for odds of approximately 19:1 against the chance occurrence of a difference of (2)(1.26) = 2.52 bu. Since the mean yield of Peatland for all stations and years was 42.0 bu., any variety that differs from it by more than 2.5 bu. may be judged as probably significantly lower in yield, on the basis of these tests alone. On this basis Manchuria, Glabron and Velvet are significantly lower in yield than Peatland.

The interaction of varieties x stations was also significant (table 8). A first order interaction is essentially a difference between two differences. The mean yield of Peatland at University Farm, for an average of both years, was 10.7 bushels less than the yield of Wisconsin #38 (50.7 - 40.0 = 10.7). The mean yield of Peatland at Grand Rapids exceeded the yield of Wisconsin #38 by 8.3 bu. (31.2 - 22.9 = 8.3). The question then is whether these two differences are significantly different. This difference between two differences will be given by Wisconsin #38 minus Peatland at University Farm less Wisconsin #38 minus Peatland at Grand Rapids, or (50.7 - 40.0) - (22.9 - 31.2) = 19.0 bu. The standard error of this "cross difference" will be $\sqrt{19.0868 \times 2 \times 2 / 6}$ = 3.567.

It may also be computed as follows: The standard error of the mean ($\sigma_{\bar{x}}$) is equal to $\sqrt{19.0868/6}$ = 1.784 bu., since 6 plots are contained in each mean. The standard error of the difference between two differences then is $1.784\sqrt{2}\sqrt{2}$ = 3.567 bu. Twice this is 7.13 bu. and any "cross difference" that exceeds this value is expected to occur less than once in 20 trials by random sampling alone. The cross difference for Peatland and Wisconsin #38 at University Farm and Grand Rapids greatly exceeds 7.13 bu., being 19.0 bu.
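The two routes to the standard error of a cross difference given above are instances of one rule: the error variance is multiplied by the sum of the squared coefficients of the comparison and divided by the number of plots in each mean. A minimal sketch of that rule follows; the function name and the coefficient lists are illustrative assumptions, and the means are those quoted in the text.

```python
import math

ERROR_MEAN_SQUARE = 19.0868  # error variance from table 8


def contrast_standard_error(error_ms, coefficients, plots_per_mean):
    """Standard error of sum(c_i * mean_i), each mean based on plots_per_mean plots."""
    return math.sqrt(error_ms * sum(c * c for c in coefficients) / plots_per_mean)


# Two-year means (6 plots each): Wis. #38 and Peatland at University Farm,
# then Wis. #38 and Peatland at Grand Rapids.
means = [50.7, 40.0, 22.9, 31.2]
coefficients = [1, -1, -1, 1]  # (W - P at U. Farm) - (W - P at Grand Rapids)

cross_difference = sum(c * m for c, m in zip(coefficients, means))
se_cross = contrast_standard_error(ERROR_MEAN_SQUARE, coefficients, plots_per_mean=6)

# Difference between two variety means over all tests (24 plots each).
se_variety_difference = contrast_standard_error(ERROR_MEAN_SQUARE, [1, -1], plots_per_mean=24)

print(round(cross_difference, 1), round(se_cross, 3), round(2 * se_cross, 2))  # 19.0 3.567 7.13
print(round(se_variety_difference, 2))                                        # 1.26
```

With coefficients 1 and -1 the rule reduces to the familiar standard error of a difference between two means, so the same function serves for main-effect comparisons and for first order interactions.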
It is clear, therefore, that these two varieties responded in a differential manner at University Farm and Grand Rapids as an average of 1932 and 1935. Other significant cross differences could be found in the same way.

Significant interactions of varieties x years could be determined by application of the general procedure outlined above. Since only two years are involved these interactions of varieties x years can have very little practical significance.

While the second order interaction of varieties x stations x years was also significant, it is of secondary interest. This significant second order interaction merely means that certain differential responses of varieties x stations were not constant in different years. To illustrate the types of comparisons which must be made to show this, take the means of Glabron and Velvet at University Farm in 1932 and 1935 separately and the same yields at Grand Rapids. Then:

[(36.8 - 26.8) - (46.0 - 57.0)] - [(14.4 - 32.2) - (21.4 - 26.0)] = 34.2

with an error of $\sqrt{19.0868 \times 2 \times 2 \times 2 / 3}$ or 7.13 bu. Since the difference of 34.2 exceeds (2)(7.13) = 14.26 bu. it is obviously significant.

For a complete understanding of a complex analysis, of which that given in table 8 is an example, one further comparison can be made. Suppose that V, S and Y are designated to represent variance due to varieties, stations and years and V x S, V x Y and V x S x Y the interaction variances. Then one may determine whether the variance due to:

V > V x S > V x S x Y > Error
V > V x Y > V x S x Y

by means of the "F" test. When the variance due to varieties significantly exceeds the interaction, varieties x stations, there is evidence that varietal performance generally was consistent enough to demonstrate that some varieties were the best in all stations, as an average of the years in which tests were made.
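The chain of comparisons just listed amounts to forming variance ratios between neighbouring mean squares rather than against error alone. The sketch below merely forms those ratios from the mean squares of table 8; the dictionary and function names are illustrative assumptions, and each ratio must still be compared with the tabulated "F" for its pair of degrees of freedom, which the sketch does not attempt.

```python
# Mean squares and degrees of freedom copied from table 8.
MEAN_SQUARES = {
    "V": 241.4272,
    "VxS": 107.7845,
    "VxY": 128.3868,
    "VxSxY": 168.3783,
    "Error": 19.0868,
}
DEGREES_OF_FREEDOM = {"V": 4, "VxS": 12, "VxY": 4, "VxSxY": 12, "Error": 64}


def variance_ratio(numerator, denominator):
    """Return F and its (numerator, denominator) degrees of freedom."""
    f_value = MEAN_SQUARES[numerator] / MEAN_SQUARES[denominator]
    return f_value, (DEGREES_OF_FREEDOM[numerator], DEGREES_OF_FREEDOM[denominator])


comparisons = [
    ("V", "VxS"),        # varieties against varieties x stations
    ("V", "VxY"),        # varieties against varieties x years
    ("VxS", "VxSxY"),    # first order against second order interaction
    ("VxY", "VxSxY"),
    ("VxSxY", "Error"),  # second order interaction against error
]

ratios = {pair: variance_ratio(*pair) for pair in comparisons}
for (num, den), (f_value, dfs) in ratios.items():
    print(f"{num} / {den}: F = {f_value:.2f} with {dfs[0]} and {dfs[1]} D.F.")
```

A ratio below one, as for V x S against V x S x Y in these data, cannot be significant whatever the degrees of freedom.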
When the variety variance significantly exceeds that of varieties x years, one may conclude that, as an average of all stations, some varieties were consistently better in yield in the different years. Further, when the interaction of varieties x stations significantly exceeds varieties x stations x years, it is plain that the differential responses of the varieties at the separate stations were sufficiently similar in the different years to warrant the conclusion that these differential responses may be permanent features of these localities.

Unless the variance for varieties significantly exceeds that of varieties x stations or varieties x years, no general recommendations can be made for the entire state or for future years. To make such recommendations the stations (at which tests were made) are considered as random samples of all places in the state and the years in which tests were conducted must be considered as a random sample of all future years. It is only when the number of stations and years can be considered an adequate sample of all possible places and years that worthwhile predictions can be made for all places in the state and for future years. (See Summerby, 1937).

VII. The Homogeneity Test

The question may be raised as to whether the data afforded by the several experiments are sufficiently alike to assume that they may have resulted from a single population. In case this is true, the data from the experiments may be consolidated and analyzed as one complex experiment.

Homogeneity tests have been suggested by Snedecor (1937) and by Stevens (1936). The formula may be explained as follows:

Let n = the number of experiments.
    n - 1 = the number of degrees of freedom between experiments.
    M = the number of degrees of freedom for error within an experiment.
    e = the sum of squares for error in a single experiment.
    v = the observed variance.
    V = the theoretical variance.
    L = the Lexis ratio.

The observed variance of the sums of squares due to error for all experiments is:

$v = \frac{S(e - \bar{e})^2}{n - 1} = \frac{S(e^2) - (Se)^2/n}{n - 1}$

The theoretical variance, where the total, S(e), is assumed to be the population of error sums of squares, is as follows:

$V = 2M\left[\frac{S(e)}{nM}\right]^2$ or $\frac{2}{M}\left[\frac{S(e)}{n}\right]^2$

The Lexis ratio (L) is the ratio of the observed to the theoretical standard error, so that its square is $L^2 = v/V$. When the ratio is greater than one, the series of sums of squares due to error is called supernormal. When "L" is less than one, it is called subnormal. A certain degree of supernormality or subnormality can be attributed to chance. The limits for significance can be determined by the $\chi^2$ test, viz.,

$\chi^2 = (n-1)L^2$ or $\frac{(n-1)(v)}{V}$

When $\chi^2$ corresponds to a probability of less than 0.05, the series is too supernormal to admit that they resulted from a single population. A $\chi^2$ that corresponds to a probability greater than 0.95 indicates that the series is too subnormal for consolidation of the data.

The homogeneity test can be applied to the data on the barley yield trials as compiled in the separate tests in table 10:

                          Sums of squares
Test             Year      due to error
U. Farm          1932         41.5894
U. Farm          1935        273.7280
Waseca           1932        239.4107
Waseca           1935         78.5720
Crookston        1932        154.4134
Crookston        1935        162.8093
Grand Rapids     1932        189.7080
Grand Rapids     1935         81.3240
                 S(e) =    1,221.5548

n = 8, (n - 1) = 7, M = 8.

The sum of squares for the eight "error sums of squares" is as follows:

$S(e^2)$ = $(41.5894)^2 + (273.7280)^2 + \ldots + (81.3240)^2$ = 233,100.8233

$(Se)^2/n$ = 186,524.5773

$S(e - \bar{e})^2 = S(e^2) - (Se)^2/n$ = 46,576.2460

$v = \frac{S(e - \bar{e})^2}{n - 1}$ = 46,576.2460/7 = 6,653.7494

$V = 2M\left[\frac{S(e)}{nM}\right]^2$ = $16\left[\frac{1221.5548}{(8)(8)}\right]^2$ = $16(19.0868)^2$ = (16)(364.3059) = 5,828.8944

$\chi^2 = \frac{(n-1)(v)}{V}$ = (7)(6,653.7494)/5,828.8944 = 7.9906

When the $\chi^2$ table is entered for 7 degrees of freedom, P = 0.3335. Therefore, the data are sufficiently homogeneous to permit the calculation of one generalized standard error for all tests.

VIII. Transformation of Percentage Data

Some types of discrete data cannot be combined to provide a valid estimate of a generalized standard error. This applies particularly to some forms of percentage data wherein each variate represents a certain number of observations of a given type or condition out of a total number of trials or cases (N). The variance of a single variate of this type is pqN, where q = 1 - p. It is clearly dependent upon p, the estimated ratio of existence of the type or condition in question, as well as upon N. Bliss (1937), Salmon (1938), Cochran (1938), Clark and Leonard (1939), and others have recognized that each variate in discrete data of this kind does not have the same opportunity to contribute equally to a general experimental error.

(a) The Angular Transformation

R. A. Fisher has supplied a mathematical transformation for such data which will equalize the estimated variance of each variate so that it is functionally dependent only on N, the total number of trials. In this transformation, each estimate of p is replaced by $\sin^2\theta$, whence $\theta = \sin^{-1}\sqrt{p}$ or $\frac{1}{2}\cos^{-1}(1 - 2p)$.

This transformation must be applied to discrete data of this type so that the analysis of variance may be valid. However, it is of little practical importance when the percentage values are between 30 and 70. Bliss (1937) has compiled a convenient table for the transformation of percentage values to angles, the latter being measured in degrees (See Table 5, appendix).

(b) Classification of Percentage Data

The type of discrete data, rather than its expression in percentages, determines whether or not the transformation should be employed.
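The angular transformation of (a) is easily computed directly when a table is not at hand. The following is a minimal sketch using only the standard `math` module; the function names are illustrative, and angles are returned in degrees to match the convention of the table cited above.

```python
import math


def angular_transform(percent):
    """Angle in degrees whose squared sine equals the proportion p = percent / 100."""
    p = percent / 100.0
    return math.degrees(math.asin(math.sqrt(p)))


def angular_transform_cosine_form(percent):
    """The equivalent form, one half of the angle whose cosine is 1 - 2p."""
    p = percent / 100.0
    return 0.5 * math.degrees(math.acos(1.0 - 2.0 * p))


for percent in (0.0, 25.0, 50.0, 100.0):
    print(percent, round(angular_transform(percent), 1))
# 0.0 0.0
# 25.0 30.0
# 50.0 45.0
# 100.0 90.0
```

Both forms give the same angle, so the choice between them is one of convenience only.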
The types of percentage data are classified by Clark and Leonard (1939) as follows:

(1) Continuous data from an experimental study may be expressed as percentages when each variate is divided by an arbitrary constant value, whereby each variate becomes a percentage of some standard or average. Clearly such a procedure merely transforms the unit of measurement. Percentages of this type should be treated statistically exactly as though the data were in their raw form. For example, yield data might be expressed in percentage of the check instead of actual yield in pounds.

(2) Continuous data are often expressed in percentages to show concentrations. This type of percentage is very common. Some examples are: seed purity given by weight of pure seed/total weight of seed, leafiness given by leaf weight/total plant weight, protein content given by weight of protein/total weight, sugar content given by weight of sugar/weight of root, etc. Such concentrations should not, as a rule, be subjected to any transformation to equalize the variance.

(3) The third type of percentage is where the original data are discrete, being based upon a determinate number of trials or cases (N). The transformation, $p = \sin^2\theta$, should be applied to this type where it is desired to construct a generalized standard error. Illustrations of this type are as follows: Germination percentages given by number of seeds germinated/total seeds, disease percentages given by number of plants diseased/total plants, etc.

References

1. Bliss, C. I. The Analysis of Field Experimental Data Expressed as Percentages. Plant Protection Bul. 12, U.S.S.R. (Leningrad). 1937.
2. Clark, Andrew, and Leonard, Warren H. The Analysis of Variance with Special Reference to Data Expressed as Percentages. Jour. Am. Soc. Agron., 31:55-66. 1939.
3. Cochran, W. G. Some Difficulties in the Statistical Analysis of Replicated Experiments. Emp. Jour. Exp. Agr., 6:157-175. 1938.
4. Fisher, R. A. Design of Experiments. Oliver and Boyd. 2nd Ed. pp. 75-76. 1937.
5. Goulden, C. H. Methods of Statistical Analysis. John Wiley. p. 139. 1939.
6. Immer, F. R., Hayes, H. K., and Powers, L. R. Statistical Determination of Barley Varietal Adaptation. Jour. Am. Soc. Agron., 26:403-419. 1934.
7. Paterson, D. D. Statistical Technique in Agricultural Research. McGraw-Hill. pp. 58-65, 66-67, and 205-208. 1939.
8. Salmon, S. C. Generalized Standard Errors for Evaluating Bunt Experiments for Wheat. Jour. Am. Soc. Agron., 30:647-663. 1938.
9. Snedecor, G. W. Statistical Methods. Collegiate Press, Inc. pp. 196-197. 1937.
10. Stevens, W. L. Heterogeneity of a Set of Variances. Jour. Gen., 33:398-399. 1936.
11. Summerby, R. The Use of the Analysis of Variance in Soil and Fertilizer Experiments with Particular Reference to Interactions. Sci. Agr., 17:302-311. 1937.

Questions for Discussion

1. What are the advantages of a complex experiment over separate single tests?
2. As a matter of design, would it be necessary to have all varieties in all locations in each year? Why?
3. Why does the total sum of squares for the simple tests fall short of the total sum of squares for the complex experiment? What would make them check?
4. How would you interpret a significant interaction such as, for example, varieties x stations?
5. Explain why a first order interaction is essentially a difference between two differences.
6. What is a homogeneity test? Why should it be made?
7. Under what conditions should percentage data be transformed to degrees of an angle to admit valid use of a pooled estimate of error?

Problems

1. The yields in bushels per acre for five spring wheat varieties tested in 3 randomized blocks for 3 years are given below. (Data from F. R. Immer).

               Block number, 1931          Block number, 1932          Block number, 1935       Grand
Variety       I     II    III    Tot.     I     II    III    Tot.     I     II    III    Tot.    Tot.

University Farm
Thatcher     17.0  20.0  19.7   56.7    33.6  37.7  31.2  102.5    32.4  34.3  37.3  104.0   263.2
Ceres        16.1  18.9  20.3   55.3    29.7  30.0  35.9   95.6    20.2  27.5  25.9   73.6   224.5
Reward       21.1  23.1  21.8   66.0    24.1  26.9  29.8   80.8    29.2  27.8  30.2   87.2   234.0
Marquis      15.4  20.9  18.4   54.7    26.3  31.8  29.8   87.9    12.8  12.3  14.8   39.9   182.5
Hope         20.3  21.0  14.2   55.5    28.1  25.4  31.5   85.0    21.7  24.5  23.4   69.6   210.1
Total        89.9 103.9  94.4  288.2   141.8 151.8 158.2  451.8   116.3 126.4 131.6  374.3  1114.3

Waseca
Thatcher     26.8  33.6  26.4   86.8    22.8  20.7  23.5   67.0    28.5  30.1  30.1   88.7   242.5
Ceres        29.2  32.4  26.0   87.6    24.3  26.2  26.7   77.2    15.6  14.5  14.3   44.4   209.2
Reward       23.3  22.8  18.5   64.6    27.2  24.9  24.6   76.7    18.0  22.4  25.2   65.6   206.9
Marquis      26.2  28.8  23.3   78.3    27.8  26.3  24.0   78.1    13.4   6.4   4.9   24.7   181.1
Hope         22.6  21.0  24.2   67.8    24.0  23.7  23.3   71.0    28.0  29.0  25.3   82.3   221.1
Total       128.1 138.6 118.4  385.1   126.1 121.8 122.1  370.0   103.5 102.4  99.8  305.7  1060.8

Crookston
Thatcher     39.0  37.6  30.4  107.0    23.1  15.2  20.8   59.1    27.0  24.2  17.5   68.7   234.8
Ceres        34.6  37.4  33.7  105.7    31.1  19.5  20.9   71.5    15.4  11.0  11.5   37.9   215.1
Reward       32.5  31.3  29.8   93.6    23.1  22.8  19.8   65.7    16.8  16.4  14.6   47.8   207.1
Marquis      31.4  26.4  30.5   88.3    20.1  19.2  15.5   54.8     3.4   5.9   8.4   17.7   160.8
Hope         27.8  30.6  29.4   87.8    19.5  25.2  20.8   65.5    18.0  18.3  15.0   51.3   204.6
Total       165.3 163.3 153.8  482.4   116.9 101.9  97.8  316.6    80.6  75.8  67.0  223.4  1022.4

Total       383.3 405.8 366.6 1155.7   384.8 375.5 378.1 1138.4   300.4 304.6 298.4  903.4  3197.5

(a) Calculate the analysis of variance for the complete study.
(b) Test the significance of the different mean squares compared with the error variance, using the F test.
(c) Compare Thatcher with Ceres, as an average of all tests, using the standard error of the difference.
(d) What would be the standard error for testing the significance of the interaction between Thatcher and Ceres in 1932 and 1935, as an average of all stations?
Make the proper test of significance.
(e) Do the same, as under (d), for comparing Thatcher and Ceres at University Farm and Waseca, as an average of all years. Is this interaction significant?
(f) Calculate the sums of squares for blocks, varieties, error and total for each of the 9 separate tests and add the different components for all 9 tests. Compare these sums of squares with appropriate combinations in the complete analysis of variance table.

2. Test the data in problem 1 for homogeneity.

3. The relative infection in different varieties for 5 bunt collections was as follows (Data from Salmon, 1938):

                                     Variety
Bunt            Oro            Ridit           Albit           Turkey
Collection   (1)¹    (2)     (1)    (2)     (1)    (2)     (1)    (2)
            (Pct.)  (Pct.)  (Pct.) (Pct.)  (Pct.) (Pct.)  (Pct.) (Pct.)
1             0.0    0.9     6.3    3.9     0.0    0.0     8.3    6.5
2             2.5    3.6     8.7    2.2    92.4   90.5    89.0   84.3
3             1.5    0.0     6.0    0.7    93.7   90.1     3.3    6.0
4             1.5    6.3     4.1    3.1    14.0   lj.5    81.7   87.2
5             0.6    1.7     3.9    3.6     4.2    3.2     7.5    2.4

¹These numbers refer to replicates.

(a) Transform these percentage data to degrees of an angle and compute the analysis of variance for varieties, replicates, and collections.
(b) Compute the data without the transformation and compare the results with (a).

CHAPTER XVIII

THE SPLIT-PLOT EXPERIMENT¹

I. Use of Split-Plot Experiments

Sometimes it is an advantage to use relatively large plots for one series of treatments and to subdivide these whole plots into a number of sub-plots on which a second series of treatments is superimposed. This type of design, called the split-plot experiment, was first proposed by Yates (1933, 1935). It is particularly useful in spacing tests with crop plants, some fertilizer trials, and in cultural studies. LeClerg (1937) used this type of experimental design to ascertain the effect of 5 fertilizer mixtures (main treatments) on the seedling stand in plots sown with treated and untreated seed (sub-treatments). Goulden (1939) gives a more complicated split-plot design in which he studied the incidence of root-rot on wheat varieties, kinds of dust for seed treatment, method of dust application, and efficacy of soil inoculation with the root-rot organism.

The split-plot design provides a more critical comparison of the sub-plot treatments than it does for the whole-plot units. This is due to the larger number of replications of the small units which, in turn, provide a larger number of degrees of freedom for error. Paterson (1939) advises that the less important treatment effects be allocated to the whole plots and the more important treatment effects to the sub-plots in order to obtain the maximum precision where it is most desired.

The split-plot design leads to two or more errors. To simplify the computation, all treatment values should be expressed in sub-plot units.

II. Data Used for Computation

Two designs are outlined below together with the method of computation. These data are from a corn uniformity trial conducted at Waseca (Minnesota) in 1933 by C. W. Doxtator. They are for yield in pounds for the central two rows of four-row plots 12 hills long. For purposes of calculation, it was supposed that these data were obtained from 10 hybrids which are designated 1, 2, 3, ..., 10. It was supposed further that these varietal plots had been split into three parts to test the yield of those crosses obtained from F1, F2, and F3 generation seed. These are designated a, b, and c, respectively. The yields in the tables that follow are in the same order as in the field. The hypothetical hybrids and generations were superimposed on the data by random arrangement.

III. Sub-treatments Randomized within Main Plot (Plan A)

The field arrangement of the plots is given below. The 10 hybrids are assumed to have been planted in rows of 36 hills, using F1 seed for 12 hills, F2 seed for 12 hills and F3 seed for the remaining 12 hills.
The order of the hybrids in the field is random and the three generations of seed for each hybrid are planted in a random order within each long row.

Hybrid Number
   3    8    2    1    6    7   10    9    4    5
   a    c    a    a    c    b    c    b    c    b
   c    a    b    c    b    a    b    c    a    a
   b    b    c    b    a    c    a    a    b    c

¹This chapter is a modification of one prepared by Dr. F. R. Immer, University of Minnesota, for his Applied Statistics course.

The yields of each plot are given below in table 1. Data from two blocks are used to illustrate the calculations.

Table 1. Yield of corn per 12-hill plot and sums of yield of 36-hill plots.

Block I
Hybrid Number     3      8      2      1      6      7     10      9      4      5    Total
                a 48   c 46   a 46   a 42   c 43   b 47   c 48   b 46   c 46   b 49
                c 46   a 45   b 44   c 46   b 45   a 49   b 45   c 48   a 48   a 49
                b 43   b 42   c 42   b 44   a 44   c 47   a 45   a 47   b 47   c 48
T.               137    133    132    132    132    143    138    141    141    146    1375

Block II
Hybrid Number     4      3      9      5      1      7      2      8      6     10    Total
                c 46   a 45   a 46   b 45   b 43   c 48   c 44   a 44   b 47   c 43
                a 48   b 44   c 46   a 45   c 50   b 51   a 48   c 46   a 48   b 43
                b 42   c 42   b 44   c 43   a 44   a 48   b 47   b 46   c 44   a 42
T.               136    131    136    133    137    147    139    136    139    128    1362

(a) Calculation of Sums of Squares

The analysis of variance is given in table 2.

Table 2. Analysis of Variance

Variation due to:           D.F.    Sum of Squares    Mean Square
Blocks                        1          2.8166          2.8166
Hybrids                       9         77.6833          8.6315
Error (a)                     9         81.0167          9.0019
Plots of hybrids            (19)       161.5166
Generations                   2          7.2333          3.6167
Hybrids x generations        18         88.7667          4.9315
Error (b)                    20         40.6667          2.0333
Total                        59        298.1833

The total sum of squares is calculated from the squares of the 60 individual plot yields as $S(x^2) - (Sx)^2/N$ which, numerically, is

$$125{,}151.0000 - 124{,}852.8167 = 298.1833$$ (59 D.F.).

The sum of squares for blocks is

$$\frac{1375^2 + 1362^2}{30} - 124{,}852.8167 = 2.8166$$ (1 D.F.)

Sum of squares for total plots of hybrids is calculated from the marginal totals for hybrids in the above table. Thus,

$$\frac{137^2 + 133^2 + \cdots + 128^2}{3} - 124{,}852.8167 = 161.5166$$ (19 D.F.)

To obtain the sums of squares for hybrids, generations and the interaction between them it is necessary to set up another table with the yields of the two replicates of each treatment combined.

Generation                      Hybrid Number                            Sum
               1     2     3     4     5     6     7     8     9    10
a             86    94    93    96    94    92    97    89    93    87    921
b             87    91    87    89    94    92    98    88    90    88    904
c             96    86    88    92    91    87    95    92    94    91    912
Sum          269   271   268   277   279   271   290   269   277   266   2737

Sum of squares for hybrids will be:

$$\frac{269^2 + 271^2 + \cdots + 266^2}{6} - 124{,}852.8167 = 77.6833$$ (9 D.F.)

Sum of squares for generations is obtained from

$$\frac{921^2 + 904^2 + 912^2}{20} - 124{,}852.8167 = 7.2333$$ (2 D.F.)

The sum of squares of the 30 yields in the above table will be equal to

$$\frac{86^2 + 94^2 + \cdots + 91^2}{2} - 124{,}852.8167 = 173.6833$$ (29 D.F.)

Sum of squares for interaction of hybrids x generations will be:

   173.6833 (29 D.F.)
  -  7.2333 ( 2 D.F.) for generations
  - 77.6833 ( 9 D.F.) for hybrids
    88.7667 (18 D.F.) = sum of squares for interaction

(b) Errors to Test Significance

These sums of squares are brought together in table 2. The sum of squares for error (a) is obtained by subtracting the sums of squares for blocks (1 D.F.) and hybrids (9 D.F.) from "plots of hybrids" (19 D.F.). The sum of squares for error (b) is obtained by subtracting "plots of hybrids", generations and hybrids x generations from the total.

Error (a) is an ordinary randomized block error and may be used to test the significance of differences between hybrids.

Error (b) is obtained from the sum of the interactions between generations and blocks within hybrids. Thus, a table could be arranged for the data from hybrid No. 3 (see table 1) as follows:

Generation     Block I    Block II
a                 48         45
b                 43         44
c                 46         42

The interaction of blocks x generations, for 2 D.F., could be used as error for this simple comparison. However, a table similar to the above could be set up for each hybrid. There would be, then, 10 x 2 = 20 degrees of freedom for error. This is what is used as error (b).
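The Plan A arithmetic can be checked with a short script. This is only a sketch: the yields, hybrid order and generation letters are transcribed from table 1 (where a printed figure was hard to read it was chosen to agree with the marginal totals), and the names `HYBRIDS`, `GENERATIONS`, `YIELDS`, `totals` and `corrected_ss` are illustrative, not part of the original manual.

```python
from collections import defaultdict

# Table 1 (Plan A), transcribed in field order: three 12-hill plots per 36-hill row.
HYBRIDS = {
    "I":  [3, 8, 2, 1, 6, 7, 10, 9, 4, 5],
    "II": [4, 3, 9, 5, 1, 7, 2, 8, 6, 10],
}
GENERATIONS = {  # generation letter of each plot, one string per field row
    "I":  ["acaacbcbcb", "cabcbabcaa", "bbcbacaabc"],
    "II": ["caabbccabc", "abcacbacab", "bcbcaabbca"],
}
YIELDS = {
    "I":  [[48, 46, 46, 42, 43, 47, 48, 46, 46, 49],
           [46, 45, 44, 46, 45, 49, 45, 48, 48, 49],
           [43, 42, 42, 44, 44, 47, 45, 47, 47, 48]],
    "II": [[46, 45, 46, 45, 43, 48, 44, 44, 47, 43],
           [48, 44, 46, 45, 50, 51, 48, 46, 48, 43],
           [42, 42, 44, 43, 44, 48, 47, 46, 44, 42]],
}

def totals(records, key):
    """Sum the yields of all plots that share the same key."""
    out = defaultdict(float)
    for rec in records:
        out[key(rec)] += rec["y"]
    return out

def corrected_ss(tot, plots_per_total, cf):
    """Sum of squared totals divided by plots per total, minus the correction factor."""
    return sum(t * t for t in tot.values()) / plots_per_total - cf

records = [
    {"block": b, "hybrid": HYBRIDS[b][col],
     "gen": GENERATIONS[b][row][col], "y": YIELDS[b][row][col]}
    for b in YIELDS for row in range(3) for col in range(10)
]
n = len(records)
cf = sum(r["y"] for r in records) ** 2 / n  # (Sx)^2 / N

ss = {
    "total": sum(r["y"] ** 2 for r in records) - cf,
    "blocks": corrected_ss(totals(records, lambda r: r["block"]), 30, cf),
    "plots of hybrids": corrected_ss(
        totals(records, lambda r: (r["block"], r["hybrid"])), 3, cf),
    "hybrids": corrected_ss(totals(records, lambda r: r["hybrid"]), 6, cf),
    "generations": corrected_ss(totals(records, lambda r: r["gen"]), 20, cf),
}
cells = corrected_ss(totals(records, lambda r: (r["hybrid"], r["gen"])), 2, cf)
ss["hybrids x generations"] = cells - ss["hybrids"] - ss["generations"]
# Error (a): whole-plot error; error (b): sub-plot error, both by subtraction.
ss["error (a)"] = ss["plots of hybrids"] - ss["blocks"] - ss["hybrids"]
ss["error (b)"] = (ss["total"] - ss["plots of hybrids"]
                   - ss["generations"] - ss["hybrids x generations"])

for name, value in ss.items():
    print(f"{name:24s}{value:10.4f}")
```

The printed sums of squares should agree with table 2 to within rounding in the last decimal place (the manual appears to truncate the block figure to 2.8166).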
The mean square for error (b) is, then, the average error of blocks x generations within hybrids. It will be the legitimate error to use for comparing differences between generations and testing the interaction of hybrids x generations. In practice this sum of squares is obtained by subtraction.

IV. Sub-treatments in Strips Across Blocks (Plan B)

The same yield figures are used in this plan as in Plan A. The location of the hybrids is also the same. Instead of randomizing the generations within the plots for each hybrid as in Plan A, the generations are now considered planted in long strips crosswise of the entire block. However, randomization of the generations in the different blocks is used. The field plan is given below.

Table 3. Yields of corn plots and the field arrangement of these plots.

Block I
Hybrid Number     3      8      2      1      6      7     10      9      4      5    Total
                a 48   a 46   a 46   a 42   a 43   a 47   a 48   a 46   a 46   a 49    461
                c 46   c 45   c 44   c 46   c 45   c 49   c 45   c 48   c 48   c 49    465
                b 43   b 42   b 42   b 44   b 44   b 47   b 45   b 47   b 47   b 48    449
Tot.             137    133    132    132    132    143    138    141    141    146   1375

Block II
Hybrid Number     4      3      9      5      1      7      2      8      6     10    Total
                b 46   b 45   b 46   b 45   b 43   b 48   b 44   b 44   b 47   b 43    451
                c 48   c 44   c 46   c 45   c 50   c 51   c 48   c 46   c 48   c 43    469
                a 42   a 42   a 44   a 43   a 44   a 48   a 47   a 46   a 44   a 42    442
Tot.             136    131    136    133    137    147    139    136    139    128   1362

The same plots are used here as in table 1. The hybrids occur in the same order as in the previous table, the only difference being the arrangement of the generations. In table 3 the generations occur in strips crosswise of the blocks.

Table 4. Analysis of variance from the data of table 3

Key number for D.F.
and sums of squares    Variation due to                 D.F.    Sum of Squares    Mean Square
 1                     Blocks                             1          2.8166          2.8166
 2                     Hybrids                            9         77.6833          8.6315
 3 = 4-1-2             Error (a)                          9         81.0167          9.0019
 4                     Plots of hybrids                  19        161.5166

 1                     Blocks                             1          2.8166          2.8166
 5                     Generations                        2         35.4333         17.7167
 6 = 7-1-5             Error (b)                          2         16.2334          8.1167
 7                     Plots of "generations"             5         54.4833

 4                     Plots of hybrids                  19        161.5166
 8 = 5+6               Deviation of generation
                       plots from blocks                  4         51.6667
 9                     Hybrids x generations             18         61.5667          3.4204
10 = 11-9-8-4          Error (c)                         18         23.4333          1.3019
11                     Total                             59        298.1833

The total sum of squares (298.1833), sum of squares for blocks, hybrids, error (a) and total plots of crosses will be the same as under Plan A. The position of the "generation" plots has been changed, however, and the other sums of squares must be recalculated.

As far as the test of the three generations a, b and c is concerned there are but six plots as given in the marginal total of table 3. The sum of squares for those six "plots of generations" will be

$$\frac{461^2 + 465^2 + 449^2 + 451^2 + 469^2 + 442^2}{10} - 124{,}852.8167 = 54.4833$$ (5 D.F.)

To obtain the sum of squares for generations and for interaction, a table is made up by combining the two yields of each treatment.

Generation                      Hybrid Number                            Tot.
               1     2     3     4     5     6     7     8     9    10
a             86    93    90    88    92    87    95    92    90    90    903
b             87    86    88    93    93    91    95    86    93    88    900
c             96    92    90    96    94    93   100    91    94    88    934
Tot.         269   271   268   277   279   271   290   269   277   266   2737

The sum of squares for generations will be

$$\frac{903^2 + 900^2 + 934^2}{20} - 124{,}852.8167 = 35.4333$$ (2 D.F.)

The yield figures in the above table are squared, i.e., $86^2 + 93^2 + \cdots + 88^2$. The sum is divided by 2, the correction factor then being subtracted. This gives 174.6833 as the sum of squares for these 29 degrees of freedom. The sum of squares for the interaction, hybrids x generations, will be: 174.6833 - 77.6833 - 35.4333 = 61.5667 (18 D.F.).

In table 4 it is noted that the comparison of hybrids is the same as under Plan A. Error (b) will be obtained by the subtraction of blocks and generations from the "plots of generations."
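The three errors of Plan B can be reproduced in the same way. The sketch below assumes the yields and hybrid order of table 3 as transcribed here, with the generations in cross-wise strips (a, c, b in block I and b, c, a in block II); `STRIPS`, `ss_for` and the other names are illustrative only.

```python
# Table 3 (Plan B): same yields and hybrid order as Plan A, generations in strips.
HYBRIDS = {
    "I":  [3, 8, 2, 1, 6, 7, 10, 9, 4, 5],
    "II": [4, 3, 9, 5, 1, 7, 2, 8, 6, 10],
}
STRIPS = {"I": "acb", "II": "bca"}  # generation of each strip, top to bottom
YIELDS = {
    "I":  [[48, 46, 46, 42, 43, 47, 48, 46, 46, 49],
           [46, 45, 44, 46, 45, 49, 45, 48, 48, 49],
           [43, 42, 42, 44, 44, 47, 45, 47, 47, 48]],
    "II": [[46, 45, 46, 45, 43, 48, 44, 44, 47, 43],
           [48, 44, 46, 45, 50, 51, 48, 46, 48, 43],
           [42, 42, 44, 43, 44, 48, 47, 46, 44, 42]],
}

# One tuple per plot: (block, hybrid, generation, yield).
plots = [(b, HYBRIDS[b][col], STRIPS[b][row], YIELDS[b][row][col])
         for b in YIELDS for row in range(3) for col in range(10)]
cf = sum(y for *_, y in plots) ** 2 / len(plots)  # correction factor

def ss_for(key, plots_per_total):
    """Corrected sum of squares of the totals formed by grouping plots on `key`."""
    totals = {}
    for plot in plots:
        totals[key(plot)] = totals.get(key(plot), 0) + plot[3]
    return sum(t * t for t in totals.values()) / plots_per_total - cf

total = sum(y * y for *_, y in plots) - cf
blocks = ss_for(lambda p: p[0], 30)
hybrids = ss_for(lambda p: p[1], 6)
generations = ss_for(lambda p: p[2], 20)
plots_of_hybrids = ss_for(lambda p: (p[0], p[1]), 3)
plots_of_generations = ss_for(lambda p: (p[0], p[2]), 10)
cells = ss_for(lambda p: (p[1], p[2]), 2)

error_a = plots_of_hybrids - blocks - hybrids                # tests hybrids
error_b = plots_of_generations - blocks - generations        # tests generations
hybrids_x_generations = cells - hybrids - generations
error_c = (total - plots_of_hybrids
           - (plots_of_generations - blocks) - hybrids_x_generations)

print(f"error (a) {error_a:.4f}  error (b) {error_b:.4f}  error (c) {error_c:.4f}")
```

Error (b) rests on only 2 degrees of freedom here, which is the price paid under Plan B for the more precise interaction error (c).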
It is seen from table 3 that the analysis of variance to test the significance of generations involves only 6 large plots. The total yields of these could be set down from the marginal totals of table 3 as:

Generations       a       b       c     Total
Block I          461     449     465     1375
Block II         442     451     469     1362
Total            903     900     934     2737

An analysis of variance of this 2-by-3 table would give the second section in the complete analysis of table 4. Error (b) is appropriate for the comparison of differences between generations.

Error (c) is obtained by subtracting from the total the items listed in table 4 opposite error (c). Error (c) is the second order interaction of blocks x hybrids x generations, the degrees of freedom being 1 by 9 by 2 = 18. It was obtained in table 4 by subtraction but has the above meaning. Error (c) is appropriate for testing the significance of the interaction of hybrids x generations.

Since these were uniformity trial data no attempt will be made to determine significance of the different mean squares. In a practical experiment these tests are carried out in the ordinary way, the appropriate errors given in the tables being used. Yates (1933) has discussed the above two designs rather fully.

V. Comparison of Two Designs

Suppose the 10 hybrids and 3 generations of the seed of each (F1, F2 and F3) had been considered as simply 30 treatments and completely randomized within the blocks without reference to split-plot arrangements. The analysis of the data would have taken the form:

Variation due to:          D.F.
Blocks                       1
Hybrids                      9
Generations                  2
Hybrids x generations       18
Error                       29
Total                       59

The degrees of freedom for error given above (29) is equal to the sum of the degrees of freedom for errors (a) and (b) under Plan A and the sum of degrees of freedom for errors (a), (b) and (c) under Plan B.

Plan B is the same as Plan A insofar as precision of tests of the hybrids is concerned. It differs from Plan A in that precision for the comparison of generations is sacrificed in order to obtain greater precision for the interaction.

The design of an experiment will depend entirely on the element of the treatments for which the highest degree of precision is desired. When the primary emphasis is to be placed on the interactions, at the expense of higher errors for the main effects, Plan B is to be preferred. When the main effects are of major interest either the complete randomized block or Plan A is to be preferred.

In practice the relative differences in magnitude of the different errors under Plans A and B will depend on the dimensions of the blocks. In this case the blocks were 40 rows wide, or 140 feet, and the 36-hill rows of hybrids were 126 feet long. Consequently the "generation" plots tended to be closer together than the most distant hybrids in the same block.

Plan A is particularly applicable to studies of space relationship between plants in relation to yield. In a recent study of the effect of spacing on yield of soybeans, conducted by the Division of Agronomy and Plant Genetics, U. of Minnesota, Plan A was found to be admirably suited to the test. The soybeans were planted in 4-row plots, the rows being 16, 20, 24, 28, 32, and 40 inches apart. Then, the soybeans were planted at 4 different rates within each spacing, being 1/2, 1, 2, and 3 inches apart within rows. The only easy way to lay out such a test was to plant the plots of different width rows crosswise of the regular 132-foot series. The 4 different rates of seeding were then randomized within these long rows, the ultimate plots being 33 feet long.

Plan A could be laid out as follows also, using the same notation as employed in table 1.

Hybrid Number       3     :     8     :     2     : etc.
                a : c : b : c : a : b : b : a : c : etc.
Here the hybrids are planted in groups of 3 plots with the three generations in a random arrangement within each hybrid plot but they occur side by side instead of end to end. By this plan it is obvious that the comparisons between generations (a, b, and c) will have a lower error than comparisons between hybrids (1, 2, 3, ...). The data from this arrangement would be analyzed exactly in the same manner as given under Plan A.

VI. Randomized-Block vs. Split-Plot Experiments

The relative efficiency of randomized-block and split-plot experiments was studied on uniformity trial data with sugar beets by LeClerg (1937) both in the field and in the greenhouse. He compared the magnitude of the variance of the sub-plots within main plots in the split-plot design with the variance of sub-plots within blocks in the randomized block arrangement. The variance for sub-plots within main plots in the split-plot design was markedly less than that for sub-plots within blocks in the randomized arrangement. The split-plot design was 71 per cent more efficient in one set of uniformity data and 53 per cent more efficient in another. For comparisons of main plots within blocks there was a decrease in efficiency by the use of the split-plot arrangement. Similar results were obtained for greenhouse trials, although less marked.

References

1. Goulden, C. H. Methods of Statistical Analysis. John Wiley. pp. 151-159. 1939.
2. LeClerg, E. L. Relative Efficiency of Randomized-Block and Split-Plot Designs of Experiments Concerned with Damping-off Data for Sugar Beets. Phytopath., 27:942-945. 1937.
3. Paterson, D. D. Statistical Technique in Agricultural Research. McGraw-Hill. pp. 209-214. 1939.
4. Yates, F. The Principles of Orthogonality and Confounding in Replicated Experiments. Jour. Agr. Sci., 23:108-145. 1933.
5. Yates, F. Complex Experiments. Jour. Roy. Stat. Soc. Suppl., 2:181-223. 1935.

Questions for Discussion

1. What is a split-plot design? Where used to advantage?
List at least 3 situa- tions . 2. Explain the differences in field lay-outs that lead to two and three errors. 3. Under what conditions would you use Plan "A"? Plan "B"? 4. Compare the relative efficiency of split-plot and randomized-block designs super -imposed on uniformity trial data. li'roplems The following data are from a randomized "block experiment with "split plots'' designed to test the differences in yield of soybeans planted at different spacings between and within rows. Four row plots were used, one row being harvested for hay and one for seed. (A) Yield of soybeans in "bushels per acre Block Width Block No. of rows 1/2" 1" 2" 3" Total Total I 16" 25.1 21.3 22.3 22.1 Q0.8 20" 21.8 22.7 22.2 22.8 89.3 24" 21.9 21.8 21.2 20.6 8.3.5 28" 21.2 20. 4 20. k 17.9 79.9 32" 20.7 20.0' 16.3 20.0 79.0 ko^ 19.3 18.3 i?V3 16.3 JIJS ^v^o II 16" 25.2 I9.9 22.1 22.7 89.9 20" 21.9 21.3 22.1 22.9 88.2 24" 19.7 I9.8 20.1 19.8 79.4 28" 20.8 21.2 18.8 20.6 81.1+ 32" 18.3 20.7 17,5 16. 4 73.1 4o" 18.3 18.2 1 9.8 15 .j2 72.4 484.4 III 16" 15.7 21.6 22.9 " 20.3 80.5 20" 22.0 20.4 22. 4 20.7 83-5 24" 23.5 20.7 20.7 20'. 5 87.4 28" 21.5 19.9 20.3 20.9 82.8 32" 22.0 ' 19.3 13.1 17.8 7 7 -2 4o^ 20_ ! 3 16.4 17 ^3 13.3 72-9 486.3 IV 16" ' 23.8 29.0 12.3 23.5 88.6 20" ■ 27.0 21.2 20.5 20.7 89.4 24" 23.3 20.0 22.5 19.8 33.6 28" 22.3 21.3 22,7 13.9 83.6 32" 23.9 18,4 20.7 18.7 31.7 4o» 19.9 17.8 16.9 18.3 73 - 1 -04,0 Total 322.6 491.8 479.8 476.8 1971.0 1971.0 219 (B) Yield of dry hay In tons per acre Block Width Block No. 
of rows 1/2" 1" 2" 3" Total Total I 16" 2,91 2.59 2.41 2.7^ 10.65 20" 2.96 2.35 2.31 2.10 9.72 24" 2.3^ 2.30 2.21 2.23 9.08 28" 2.59 2.47 2.16 2.10 9.32 32" 2.21 2.12 2.05 1.90 8.28 40" 2.24 1.90 1.82 1.79 7^75 54.80 II 16" 2.85 2.42 2.45 2.31 10.03 20" 2.42 2.48 2.31 2.27 9.48 24" 2.40 2.19 2.29 2.08 8.96 28" 2.48 2.22 2.30 2.08 9-08 32" 2.32 2.10 2.23 2.06 8.71 40" 2.34 2.07 1.76 1.78 7.95 54.21 III 16" 2.81 2.61 2.65 2.25 10.32 20" 2.66 2.52 2.78 2.52 10.48 24" 2.57 2.41 2.28 2.15 9-39 28" 2.03 2.22 2.39 2.01 8.65 32" 2.68 2.21 1.97 1.96 8.82 40" 2.13 2.09 1.84 1.96 8.02 55-68 IV 16" 2.83 3.10 2.12 2.38 10.43 20" 3.27 2.71 2.33 2.42 10.73 24" 2.71 2.31 2.22 1.97 9.21 28" 2.52 2.53 2.24 2.09 9.38 32" 2.37 2.29 2.20 1.85 8.71 4o" 2.10 2.13 1.92 2.06 8.21 56.67 Total 60.74 56.34 53.24 51.04 221.36 221.36 The actual field arrangement of plots in this experiment, in block number III was as follows: The plot arrangement in the other "blocks was randomized in a similar manner. Width of rows - 16" 32" 28" 40" 24" 20" 3" 1" 1/2" 2" 1. Analyze the data on yields of soybeans in bu. per acre. (a) Calculate the complete analysis of variance. Test the significance of the different mean squares, compared with the appropriate error variances, by means of the F test. (b) Determine the significance of the difference between 20" and 32" rows by means of the standard error. Spacing wi thin rows: 1/2" 1" 3" 2" 1" 3" 2" 1/2" y 1/2" a. 2" 2" 1" 1/2" 3" 1" 3" 2" 1/2" 220 (c) Determine the significance of the difference "between 1/2" and. 2" spacings by means of the standard error. 2. Analyze the data on yields of soybeans for tons of dry hay per acre in a similar manner. 3- Key out the degrees of freedom for a split plot experiment (two errors) for 3 spacings, 4 blocks, and 3 widths of rows. 4. A split-plot experiment was designed to determine the effect of seed treatment on the stand and ultimate yield of dryland corn planted at 3 different dates., a,, b, c. 
Each plot consisted of 2 sub-plots, the seed being treated (T) with an organic mercury compound in one-half, and untreated (U) in the other half. There were 3 main plots in each block. All treatments were randomized. The field design of Block I was as follows:
! T ' U t T T ' U t U ' T i b c a i
The yield data for the 6 blocks of the experiment were as follows in bushels per acre:
Date Seed Bl ock Planted Treatmi snt 1 2 •2 h 3 6 Total a U 2.3 4.6 3.4 2.3 5-8 3-3 24 . 2 T 2.3 4.7 4.2 3.6 5.0 4.6 24.9 b U h.3 M 3-3 6.1 4.5 4.0 26 . 5 T 5.1 6.1 3-1 4.3 5.3 5.9 29.8 c U 2.7 1.4 2.3 3-8 2.9 5-9 17.0 T 2 . .1.3 1.3 fc-7 3.4 1-5 15.2
(a) Calculate the complete analysis of variance.
(b) Determine the significance between treated and untreated seed, and also between planting dates.

CHAPTER XIX

CONFOUNDING IN FACTORIAL EXPERIMENTS

I. Factorial Experiments

The randomized-block and Latin-square designs are widely used in field experiments, both being very efficient for simple studies. However, there are situations in experimentation where a large number of varieties or treatments are to be compared at two or more levels. The factorial experiment is useful in such situations. Suppose that three fertilizers, Nitrogen (N), Phosphorus (P), and Potash (K) are to be tested at two or more levels. The classical method of approach would be to vary the levels for each element only one at a time, i.e., the investigator would set up separate experiments to test each element alone at its respective level. The single factor could then be studied under controlled conditions at each of the two levels. To test these factors simultaneously in the same experiment would permit one to study the effects of different amounts of one fertilizer on the others in all combinations. Thus, a wider base of inductive reasoning is provided. The experimental argument is also strengthened by the larger total number of plots in the test. (See Fisher, 1935).
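As a small illustration of "all combinations", the treatments of a factorial test with three fertilizers at two rates each can be listed mechanically. The sketch below is not from the manual; the labels (factor letter followed by level 0 or 1) and the figure of four replications are assumptions made for the example.

```python
from itertools import product

FACTORS = ("N", "P", "K")   # nitrogen, phosphorus, potash
LEVELS = (0, 1)             # two rates of each fertilizer (assumed for the example)
REPLICATIONS = 4            # assumed number of complete replications

# Every combination of levels, written as e.g. "N1P0K1".
treatments = ["".join(f"{factor}{level}" for factor, level in zip(FACTORS, combo))
              for combo in product(LEVELS, repeat=len(FACTORS))]
print(treatments)

# In the factorial test every plot contributes to every main-effect comparison:
# each level of N is carried by half of all plots.
total_plots = len(treatments) * REPLICATIONS
plots_per_n_level = sum(t.startswith("N1") for t in treatments) * REPLICATIONS
print(total_plots, plots_per_n_level)
```

With these assumed figures the comparison of the two nitrogen rates is based on 16 plots at each rate, although only 32 plots are grown in all.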
Goulden (1937) describes a factorial experiment as one made to study simultaneously various treatment factors. Thus, an experiment designed to study at the same time rate and depth of seeding of a cereal crop would be a factorial experiment in which two factors, rate and depth, are represented at two or more levels. The study of interactions is an important consideration in such an experiment. The introduction of factors is limited by space and cost of experimentation.

Suppose a fertilizer test is to be conducted with N, P, and K at two different rates each. The rates can be designated by subscripts so as to give the eight possible treatment variants as follows:¹

$N_0P_0K_0$, $N_1P_0K_0$, $N_0P_1K_0$, $N_0P_0K_1$, $N_1P_1K_0$, $N_1P_0K_1$, $N_0P_1K_1$, and $N_1P_1K_1$

The degrees of freedom, i.e., the number of comparisons free to vary, may be keyed out as follows:

Variation due to      Degrees freedom     Remarks
Nitrogen (N)                1         )
Phosphorus (P)              1         )   Main effects
Potassium (K)               1         )
N x P                       1         )
N x K                       1         )   First order interactions
P x K                       1         )
N x P x K                   1             Second order interaction
Total                       7

¹Note: The subscripts 0 and 1 represent the two fertilizer levels. The symbol (x) denotes interaction and not a variable as heretofore.

II. Data for Computation of Factorial Experiment

The computations for the Analysis of Variance for such a factorial experiment will be illustrated with uniformity trial data. Four complete replications will be used. The uniformity data on crested wheatgrass were furnished by Dr. B. M. Weihing. The plots are combined as 8-row plots, 16 feet long, with rows 6 inches apart. Thus, each plot is 4 by 16 feet in size. The yields are given in grams of air-dry field cured hay. The uniformity trial data follow.

Table 1. Uniformity Data for Crested Wheatgrass

                          Blocks
Plot No.       I        II       III       IV
             (gm.)    (gm.)    (gm.)    (gm.)
1             5135     3175     4405     3750
2             4725     3980     4575     3920
3             4600     4420     3910     4175
4             4955     4580     4065     3280
5             3210     3970     3510     3190
6             3670     4255     4305     3575
7             3785     3665     3995     3530
8             3965     4315     4030     2900

III. Computation as Simple Randomized Block Experiment

The eight treatments will first be superimposed on the crested wheatgrass yield data for a randomized block test.

Table 2. Yields of Crested Wheatgrass in Randomized Blocks

                                     Treatment
Replication      0       N       P       K      NP      NK      PK     NPK     Totals
I              3210    4955    5135    4600    3965    3785    3670    4725     34045
II             3970    3175    4420    3980    4255    3665    4315    4580     32360
III            4305    4405    4575    3910    4030    3510    3995    4065     32795
IV             3530    4175    3920    3280    2900    3575    3190    3750     28320
Totals        15015   16710   18050   15770   15150   14535   15170   17120    127520

The sums of squares for blocks, treatments, total, and error are computed in the ordinary manner. The Analysis of Variance can be summarized as follows:

Variation                                                   "F" Value
due to         D.F.    Sum Squares    Mean Square     Observed    5 Pct. Point
Blocks           3      2,303,556       767,852         3.91         3.07
Treatments       7      2,628,237       375,462         1.91         2.57
Error           21      4,125,107       196,434
Total           31      9,056,900

The block effect removed is just enough to be statistically significant. Treatment effects are within the limits of error since the data are from a uniformity trial.

Note: The same randomization for treatments is used here as in the confounded experiment to be mentioned later.

The crested wheatgrass yield data will now be considered from the standpoint of confounding. This process is expected to accomplish several things: (1) A greater amount of the variability due to soil heterogeneity should be removed because more and smaller blocks will be used; (2) A chance to examine the second order interaction, N x P x K, will be forfeited; and (3) The reduction of experimental error in this manner should sharpen all treatment and interaction comparisons.

IV. Confounding in a 2 by 2 by 2 Experiment¹

A few terms must first be made clear before the analyses are made.

(a) Explanation of Terms

Every effort is made to maintain orthogonality in an experiment.
Yates (1933) defines orthogonality as follows: "Orthogonality is that property of the de- sign which ensures that the different classes of effects shall be capable of direct and separate estimation without any entanglements." Thus, orthogonality is ensured in a randomized block experiment by the very nature of the design, i.e., each block contains the same kind and number of treatments. Non- orthogonality is introduced when some of the plots in one or more of the blocks are lost . Special methods may be required to separate treatment and block effects. Non-orthogonality is sometimes deliberately introduced in factorial experiments that involve a fairly large number of combinations . This process is called confounding. The purpose is to increase the accuracy of tie more important comparisons at the expense of the comparisons of lesser importance. (b) Confounding the Second -order Inte raction The second order interaction (N x P x K) in this experiment may be considered the least important. Certainly, it would be difficult to interpret in terms of fer- tilizer practice, even though it were significant. Suppose it is desired to con- found the one degree of freedom for this interaction with blocks. To accomplish this, it is necessary to determine the distribution of treatments in the blocks in a manner so as to confound this one treatment and, at the same time, leave the others intact . Algebraically, the treatment effects can be represented as follows: Nitrogen (N) = (*1 " N ) (Px + P )(K 1 + K ) Phosphorus (P) = <*1 -*oN»i + N )(% + K ) Potassium (K) = (Ki - K o)(% + N )(P 1 +P ) N x P (»1 - * )(P1 - P )(K X +K > N x K (>1 - N ) (% - K )(P 1 +-P ) P x K (Pi -?o)( K l - K Q )(N 1 + N Q ) N x P x K (Hi - HjfcPi - Pj(Ki - O The last expression can be expanded as follows: N x P x K = (N X - N Q )(P 1 - PjjXKx - K ) = (N X P K t I^A ♦ *£& - %?!%) - (XfJb - NiP^ + N^Kj. 
+ N^K^ = (N + P + K + KPK) - (0 + UP + PK + NK) Then, the blocks could be divided as follows so as to confound the second order in- teraction with block effect: N P K NPK NP PK NK Sub -block A Sub -block B ^-For more eomplicated designs see Yates (1933), Fisher (1935) and Goulden (1937). 22*+ The contracts "between the two sub -blocks of each replicate will "be contrasts of the second order interaction (N]i?iK]_ and N P K ) . This interaction will have "been con- founded with "blocks. The sum of squares for the second order interaction will be given by: (See Goulden, 1957). 1/2 k ; (N + P + K + NPK) - (0 + NP + PK + NK)] where k - number of plots represented in each total. The above sum of squares will contain not only the second order interaction effect but also the block effect. In this ca.se, blocks of four plots each 'have been used for error control instead of blocks of eight (as would be the case in a simple randomized block experiment); and only the second order interaction has been lost. The key-out for four complete replications will be as follows: Variatio n d ue t o Dec r ees o f free dom Blocks 7 N 1 P 1 K 1 IxP 1 N x K 1 P X K 1 . error Total 1 u The treatments will be randomized in each sub-block. The field arrangement and plot yields follow: Table 3. Field Plan with Plot Locations and Yields S ub -b lock A Sub -block B Replication Treatment Yield Treatment Yield 1 N Pl K NlPlKi NpPoKl NiP K Total NlP K N P Kl NoPlKo 5155 J +725 ., 1+600 191+15 5175 3980 kkZO i4-58o NqPqKo NoPlKi KlPcICl N1P1K0 ■ 5210 3670 5735 396^ II Total N0P0K0 WlPlKo NiPqK] NoPlKi ]Jo30 3970 1+253 3665 ^315 III Total Nl?oKo N ?iK NoPoKl N1P1K1 16155 41+05 1,573 5910 M-065 Totai NiPoECi KqPoKo . NcPlKl NlPlK 16265 5510 H305 :>9?o 1+030 IT Total UlPlKi N-qPxKo NlP K NpPoKi 16955 5750 3920 ^175 3280 Total N ?lKi WlPpKl NcPbKo N1P1K0 15840 3190 3575 3530 2900 Total 15123 Total 13195 The yield data are summarize for main effects in Table h as follows 225 Table 4. 
Total Yields for Four Replications per Treatment

                          N0                               N1
               K0        K1        Sum          K0        K1        Sum
    P0       15,015    15,770    30,785       16,710    14,535    31,245
    P1       18,050    15,170    33,220       15,150    17,120    32,270
    Total    33,065    30,940    64,005       31,860    31,655    63,515

The yields for the various interactions are totalled below in Table 5:

Table 5. Total Yields for Interactions

    (a) N and K (totals over P0 + P1)
                 K0        K1        Sum
    N0         33,065    30,940    64,005
    N1         31,860    31,655    63,515
    Total      64,925    62,595   127,520

    (b) P and K (totals over N0 + N1)
                 K0        K1        Sum
    P0         31,725    30,305    62,030
    P1         33,200    32,290    65,490
    Total      64,925    62,595   127,520

    (c) N and P (totals over K0 + K1)
                 P0        P1        Sum
    N0         30,785    33,220    64,005
    N1         31,245    32,270    63,515
    Total      62,030    65,490   127,520

The sums of squares for the experiment are given in Table 6. The sums of squares for blocks, N, P, K and total can be entered directly from Table 6. The sum of squares for N x P is obtained by the subtraction of 7,503 + 374,112 (N + P) from 443,743, which is S(x²) for the N and P totals. The result is 62,128. The sums of squares for N x K and for P x K are obtained in a similar manner.

Table 6. Calculation of Sum of Squares

    Symbol          Table    Total Sum        Divide    (Sx)²/N        Corrected      D.F.
                             Squares          by                       Sum Squares
    S(x²) total       3        517,224,100      1      508,167,200     9,056,900      31
    S(x²) blocks      3      2,055,816,450      4      508,167,200     5,786,912       7
    S(x²) N           4      8,130,795,250     16      508,167,200         7,503       1
    S(x²) P           4      8,136,661,000     16      508,167,200       374,112       1
    S(x²) K           4      8,133,389,650     16      508,167,200       169,653       1
    S(x²) NP          5      4,068,887,550      8      508,167,200       443,743       1
    S(x²) NK          5      4,067,676,450      8      508,167,200       292,356       1
    S(x²) PK          5      4,069,752,750      8      508,167,200       551,893       1

The analysis of variance, together with the obtained and theoretical "F" values, is presented in Table 7. Where a treatment mean square is smaller than the error mean square, the "F" value shown is the error mean square divided by the treatment mean square.

Table 7. Analysis of Variance

    Variation              Sum          Mean             "F" Value
    due to       D.F.      Squares      Square      Obtained   5 pct. point
    Blocks         7      5,786,912     826,702       5.87        2.66
    N              1          7,503       7,503      18.76      243.91
    P              1        374,112     374,112       2.66        4.41
    K              1        169,653     169,653       1.21        4.41
    N x P          1         62,128      62,128       2.27      243.91
    N x K          1        115,200     115,200       1.22      243.91
    P x K          1          8,128       8,128      17.32      243.91
    Error         18      2,533,264     140,737
    Total         31      9,056,900

It is noted that the mean square for error has been decreased materially in the confounded experiment as compared to that in the simple randomized block experiment. In the former, the mean square is 140,737 while in the latter it is 196,434. It is also to be noted that more of the variability due to soil heterogeneity has been removed from the experimental error and drawn off in block effect, which now appears as highly significant.

The real value of confounding as a means to bring out more clearly significant treatment effects and interactions is not evidenced in this illustration because uniformity data have been employed. The confounding design is purely artificial.

V. Partial Confounding in a 2 by 2 by 2 Experiment

The above procedure resulted in the complete sacrifice of the second order interaction, but it may be argued that the experimenter has taken too much for granted. He may overcome this difficulty by partial confounding, i.e., confounding different interactions in different replications. Goulden (1937) states that the results are used from the blocks in which the particular effects are not confounded in order to recover a portion of the information desired.

The fertilizer test used as an example can be partially confounded and at the same time recover a portion of the information on all the comparisons. Four replications will be required for this purpose. In each replication, one degree of freedom can be confounded with blocks for one of the interactions. There are four interactions, viz., N x P, N x K, P x K, and N x P x K. The algebraic relations stated previously can be used to determine the treatments to place in each sub-block to gain the desired effect.
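The sign rule behind these algebraic relations can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original procedure: the function names (`contrast_sign`, `sub_blocks`, `effect_ss`) are hypothetical, the untreated plot is labelled "0" as in the formulas, and the totals are the treatment totals of the partially confounded trial described in this section. A factor named in the effect contributes (upper - lower), a factor not named contributes (upper + lower), and the treatments fall into two sub-blocks according to the sign of the product.

```python
from itertools import product

FACTORS = "NPK"


def label(levels):
    """Name a treatment by the factors at the upper level; '0' is the untreated plot."""
    name = "".join(f for f, lv in zip(FACTORS, levels) if lv == 1)
    return name or "0"


def contrast_sign(levels, effect):
    """Sign of one treatment in the contrast for `effect` (e.g. 'NP' or 'NPK').

    A factor named in the effect contributes (upper - lower), so a lower level
    flips the sign; a factor not named contributes (upper + lower), i.e. +1.
    """
    sign = 1
    for f, lv in zip(FACTORS, levels):
        if f in effect and lv == 0:
            sign = -sign
    return sign


def sub_blocks(effect):
    """Split the 8 treatments into the two groups whose difference is `effect`."""
    plus, minus = [], []
    for levels in product((0, 1), repeat=3):
        target = plus if contrast_sign(levels, effect) > 0 else minus
        target.append(label(levels))
    return sorted(plus), sorted(minus)


def effect_ss(totals, effect, k):
    """Sum of squares for 1 d.f.: [sum(+) - sum(-)]^2 / (2k), k plots per treatment total."""
    plus, minus = sub_blocks(effect)
    diff = sum(totals[t] for t in plus) - sum(totals[t] for t in minus)
    return diff ** 2 / (2 * k)


# Treatment totals over all four replications (k = 16 for main effects).
ALL_REPS = {"0": 13805, "N": 16185, "P": 16880, "K": 15435,
            "NP": 16620, "NK": 16010, "PK": 16955, "NPK": 15630}

# Totals with Replication II omitted (the replication confounding N x P x K; k = 12).
MINUS_REP_II = {"0": 10140, "N": 13010, "P": 12900, "K": 11015,
                "NP": 12365, "NK": 12040, "PK": 12640, "NPK": 11050}

if __name__ == "__main__":
    print(sub_blocks("NPK"))
    print(round(effect_ss(ALL_REPS, "N", 16)))
    print(round(effect_ss(MINUS_REP_II, "NPK", 12)))
```

For a first-order interaction the two groups returned are the same two sub-blocks as in the text, possibly in the opposite order; since the difference is squared, the order does not affect the sum of squares.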
    Interaction    Algebraic Relationship                  Sub-block A - Sub-block B
    N x P          (N1 - N0)(P1 - P0)(K1 + K0)      (N + P + NK + PK) - (0 + NP + K + NPK)
    N x K          (N1 - N0)(K1 - K0)(P1 + P0)      (N + K + NP + PK) - (0 + P + NK + NPK)
    P x K          (P1 - P0)(K1 - K0)(N1 + N0)      (P + K + NP + NK) - (0 + N + PK + NPK)
    N x P x K      (N1 - N0)(P1 - P0)(K1 - K0)      (N + P + K + NPK) - (0 + NP + PK + NK)

For the three first-order interactions the expanded product is the negative of the sub-block difference shown; the sign is immaterial because the difference is squared in the sum of squares.

The treatments within each sub-block will be randomized. Table 8 gives the field design together with the plot yields in grams for the fertilizer trial superimposed on crested wheatgrass uniformity trial data.

Table 8. Field Arrangement and Yields in Partially Confounded Experiment

                                    Sub-block A                 Sub-block B
    Replication                 Treatment  Yield (gm.)      Treatment  Yield (gm.)
    I (N x P confounded)        P             5135          0             3210
                                PK            4725          K             3670
                                NK            4600          NP            3785
                                N             4955          NPK           3965
                                Total        19415          Total        14630
    II (N x P x K confounded)   N             3175          NK            3970
                                P             3980          NP            4255
                                K             4420          0             3665
                                NPK           4580          PK            4315
                                Total        16155          Total        16205
    III (P x K confounded)      NP            4405          NPK           3510
                                P             4575          N             4305
                                NK            3910          PK            3995
                                K             4065          0             4030
                                Total        16955          Total        15840
    IV (N x K confounded)       N             3750          P             3190
                                PK            3920          NPK           3575
                                NP            4175          NK            3530
                                K             3280          0             2900
                                Total        15125          Total        13195

    Grand Total = 127,520

The treatment totals required for the computation of the sums of squares are arranged in Table 9 for the totals of the four replications, and for the omission of each replication in turn.

Table 9. Treatment Totals Required for Calculation of Sums of Squares

    Treat-    All             Minus            Minus             Minus              Minus
    ment      Replications    Replication I    Replication II    Replication III    Replication IV
    (1)       (2)             (3)              (4)               (5)                (6)
    0         13805           10595            10140              9775              10905
    N         16185           11230            13010             11880              12435
    P         16880           11745            12900             12305              13690
    K         15435           11765            11015             11370              12155
    NP        16620           12835            12365             12215              12445
    NK        16010           11410            12040             12100              12480
    PK        16955           12230            12640             12960              13035
    NPK       15630           11665            11050             12120              12055

The sums of squares can be computed as follows for the treatment effects (for 1 d.f.):

$$
\begin{aligned}
N &= \tfrac{1}{2k}\,[(N + NP + NK + NPK) - (0 + P + K + PK)]^2\\
P &= \tfrac{1}{2k}\,[(P + NP + PK + NPK) - (0 + N + K + NK)]^2\\
K &= \tfrac{1}{2k}\,[(K + NK + PK + NPK) - (0 + N + P + NP)]^2\\
N \times P &= \tfrac{1}{2k}\,[(N + P + NK + PK) - (0 + NP + K + NPK)]^2\\
N \times K &= \tfrac{1}{2k}\,[(N + K + NP + PK) - (0 + P + NK + NPK)]^2\\
P \times K &= \tfrac{1}{2k}\,[(P + K + NP + NK) - (0 + N + PK + NPK)]^2\\
N \times P \times K &= \tfrac{1}{2k}\,[(N + P + K + NPK) - (0 + NP + PK + NK)]^2
\end{aligned}
$$

For example, the interaction N x P is calculated from the replications in which it is not confounded, i.e., from Column 3, Table 9. Note that k = 12.

$$
\begin{aligned}
N \times P &= \tfrac{1}{24}\,[(11230 + 11745 + 11410 + 12230) - (10595 + 11765 + 12835 + 11665)]^2\\
&= \tfrac{1}{24}\,[46615 - 46860]^2 = \tfrac{1}{24}\,[-245]^2 = 60025/24 = 2501.04
\end{aligned}
$$

Similarly, from Column 4,

$$
\begin{aligned}
N \times P \times K &= \tfrac{1}{24}\,[(13010 + 12900 + 11015 + 11050) - (10140 + 12365 + 12040 + 12640)]^2\\
&= \tfrac{1}{24}\,[47975 - 47185]^2 = \tfrac{1}{24}\,[790]^2 = 624{,}100/24 = 26{,}004
\end{aligned}
$$

The main effects are calculated from all the replications, i.e., k = 16. The calculation for N is as follows:

$$
\begin{aligned}
N &= \tfrac{1}{32}\,[(16185 + 16620 + 16010 + 15630) - (13805 + 16880 + 15435 + 16955)]^2\\
&= \tfrac{1}{32}\,[64445 - 63075]^2 = \tfrac{1}{32}\,[1370]^2 = 1{,}876{,}900/32 = 58{,}653
\end{aligned}
$$

The total sum of squares is calculated from all plot yields in all replications of the experiment, i.e., 32 plots. The block sum of squares is computed from the 8 block totals. The ordinary method of computation is used. The analysis of variance can be set up as follows:

Table 10. Complete Analysis for Partially Confounded 2 x 2 x 2 Experiment

    Variation              Sum          Mean             "F" Value
    due to       D.F.      Squares      Square      Obtained   5 pct. point
    Blocks         7      5,786,912     826,702       5.87        2.70
    N              1         58,653      58,653       2.40      243.91
    P              1        675,703     675,703       4.79        4.45
    K              1          9,112       9,112      15.46      243.91
    N x P          1          2,501       2,501      56.34      243.91
    N x K          1         36,817      36,817       3.83      243.91
    P x K          1         65,626      65,626       2.15      243.91
    N x P x K      1         26,004      26,004       5.42      243.91
    Error         17      2,395,572     140,916
    Total         31      9,056,900

In this experiment, information is obtained on the main effects and on all interactions, including the second order interaction.
However, there is a loss of one-fourth the information on each of the interactions, due to the fact that the replication in which an interaction was confounded was omitted in the calculation of its sum of squares. The error is of approximately the same magnitude as that for the experiment in which N x P x K was completely confounded.

References

1. Fisher, R. A. Design of Experiments, pp. 96-137. 1935.
2. Goulden, C. H. Methods of Statistical Analysis. Burgess Publ. Co., pp. 107-120. 1937.
3. Wiebe, G. A. Variation and Correlation in Grain Yield Among 1500 Wheat Nursery Plots. Jour. Agr. Res., 50:331-357. (Source of Data). 1935.
4. Yates, F. The Principles of Orthogonality and Confounding in Replicated Experiments. Jour. Agr. Sci., 23:108-145. 1933.
5. Yates, F. Complex Experiments. Suppl. Jour. Roy. Stat. Soc., 2:181-247. 1935.

Questions for Discussion

1. What is a factorial experiment? Give an example.
2. Under what conditions may a factorial experiment be used?
3. What is meant by the term "orthogonality"? Give an example of an orthogonal experiment.
4. Explain the use of the term "confounding". What is done in confounding? Why?
5. Suppose a second-order interaction, N x P x K, is to be confounded. How can this be done by design?
6. What is partial confounding? How does it differ from confounding?

Problems

Some uniformity data presented by Wiebe (1935) on wheat yields in grams per row are presented below as they occurred in the field:

    Plot               Blocks
    No.        I      II     III     IV
    1         670    690    785    645
    2         685    790    770    665
    3         660    825    960    750
    4         705    805    860    635
    5         610    720    705    615
    6         640    735    805    665
    7         690    855    905    700
    8         715    765    945    820

1. Calculate these data as a randomized block experiment using the 8 fertilizer treatments given in the text example.
2. Design an experiment so as to confound the second order interaction, N x P x K. Carry through the complete analysis. Compare the results with those obtained in problem 1.
3. Design an experiment to superimpose on these data so as to partially confound the second order interaction (N x P x K). Carry through the complete analysis. Compare the results with those obtained in problems 1 and 2.

CHAPTER XX

SYMMETRICAL INCOMPLETE BLOCK EXPERIMENTS

I. Incomplete Block Tests

It has been shown (Chapter 19) that greater accuracy is obtained in factorial experiments when certain degrees of freedom for the higher-order interactions are confounded with blocks, especially when the number of combinations is large. In variety trials it is sometimes desirable to test a large number of varieties in a single experiment. To compare them in an ordinary randomized block test leads to less accuracy due to the large size of the blocks. Methods have been developed by Yates (1936, 1937) to overcome this difficulty. The procedure is analogous to confounding in factorial experiments in that the replications are divided up into smaller blocks which are used as error control units. These small blocks contain only part of the total number of varieties, hence the name "incomplete blocks".

Incomplete block experiments have been shown to give increased efficiency by Yates (1936) and Goulden (1937). Weiss and Cox (1939) found the lattice square arrangement to result in a gain of [figure illegible] per cent on extremely heterogeneous soil, but a loss of precision of [figure illegible] per cent on a very uniform soil.

One type of incomplete block experiment will be illustrated, i.e., the symmetrical incomplete block where all possible groups of sets are used. The computation procedure will follow closely that described by Weiss and Cox (1939). For other types of incomplete blocks, Goulden (1937, 1939) should be consulted. These include the two dimensional quasi-factorial with two groups of sets, and the three dimensional quasi-factorial with three groups of sets.
An excellent discussion of the lattice square design (quasi-Latin squares) is given by Weiss and Cox (1939), who applied it to a soybean variety test.

The computations will be illustrated with some uniformity trial data obtained from Dr. R. M. Weihing on forage yields of crested wheatgrass expressed in kilograms. The plots consist of 3 rows, 15 feet long, the individual rows being 6 inches apart.

II. Design of Symmetrical Incomplete Block Tests

In order to determine the details of an acceptable design with regard to the number of varieties and blocks to use, it is necessary to satisfy the condition that each variety occur with every other variety in the same number of blocks. Suppose that m varieties are replicated n times over a portion of the available blocks, each of which is to contain n' plots. Consider the n blocks in which one certain variety occurs. The total number of plots contained in these n blocks is obviously (n)(n'), of which n correspond to the one variety under consideration. Therefore, there are (n)(n') - n = (n' - 1)(n) plots available for the other m - 1 varieties in those blocks. To meet the above condition, these (n' - 1)(n) plots must be distributed equally among the m - 1 varieties that remain. For this reason, (n' - 1)(n) must either equal m - 1 or be a multiple of m - 1. Thus, it becomes apparent that m - 1 = (n' - 1)(n) is a number that must be factorable, preferably into two numbers of nearly equal size. This can be effected in two different ways.

(1) First, one may use $m = k^2$, where k is an integer, from which $m - 1 = k^2 - 1 = (k - 1)(k + 1)$. From this, it would appear that the choice of design could be either k - 1 = n' - 1 (i.e., n' = k, whence n will be k + 1), or k + 1 = n' - 1 (i.e., n' = k + 2), in which case n will be k - 1. However, when m' is equal to the total number of blocks, one must have mn = m'n'. Thus, it is clear that mn must be divisible by n'.
The first choice gives $mn/n' = k^2(k + 1)/k = k(k + 1)$, in which the divisibility is assured, with m' = k(k + 1). The second choice gives $mn/n' = k^2(k - 1)/(k + 2)$, which generally would not be an integer. Thus, only the first choice is acceptable.

(2) Second, one may choose $m = k^2 - k + 1$, from which $m - 1 = k^2 - k = k(k - 1)$. From this relationship, it appears that one has a choice of design by the use of either k - 1 = n' - 1 (i.e., n' = k), from which n will also be k, or k = n' - 1 (i.e., n' = k + 1), in which case n will be k - 1. In the analysis of this situation, the first choice gives $mn/n' = (k^2 - k + 1)k/k$. The divisibility is assured, with the result that $m' = k^2 - k + 1$. For the second choice, $mn/n' = (k^2 - k + 1)(k - 1)/(k + 1)$, a value that is not generally an integer. Thus only the first choice is acceptable.

Therefore, designs of this nature can be constructed for $m = k^2$, where m = 9, 16, 25, 36, 49, 64, etc. The $k^2 - k + 1$ type can be designed for values of m = 7, 13, 21, 31, 43, 57, 73, etc. The structure of the arrangements is rather fully discussed by Yates (1936), Fisher and Yates (1938), and by Goulden (1937, 1939).

The first type, $m = k^2 = n'^2$, will be used to illustrate the process for a completely orthogonalized 5 by 5 square. This will give a series of symmetrical incomplete block arrangements.

    1111 (1)     2222 (2)     3333 (3)     4444 (4)     5555 (5)
    2345 (6)     3451 (7)     4512 (8)     5123 (9)     1234 (10)
    3524 (11)    4135 (12)    5241 (13)    1352 (14)    2413 (15)
    4253 (16)    5314 (17)    1425 (18)    2531 (19)    3142 (20)
    5432 (21)    1543 (22)    2154 (23)    3215 (24)    4321 (25)

The explanation of the arrangement is taken directly from Weiss and Cox (1939). The numbers in parentheses designate the varieties which are to be compared in the experiment.

"These variety numbers are arranged in 6 orthogonal groups as follows:

    Group I              Group II             Group III
    (rows)               (columns)            (first number)
     1  2  3  4  5        1  6 11 16 21        1 10 14 18 22
     6  7  8  9 10        2  7 12 17 22        2  6 15 19 23
    11 12 13 14 15        3  8 13 18 23        3  7 11 20 24
    16 17 18 19 20        4  9 14 19 24        4  8 12 16 25
    21 22 23 24 25        5 10 15 20 25        5  9 13 17 21

    Group IV             Group V              Group VI
    (second number)      (third number)       (fourth number)
     1  9 12 20 23        1  8 15 17 24        1  7 13 19 25
     2 10 13 16 24        2  9 11 18 25        2  8 14 20 21
     3  6 14 17 25        3 10 12 19 21        3  9 15 16 22
     4  7 15 18 21        4  6 13 20 22        4 10 11 17 23
     5  8 11 19 22        5  7 14 16 23        5  6 12 18 24

In group I the variety numbers are copied from the rows of the square, each row of the group specifying a block in the field. In like manner, the variety numbers in the blocks of group II are taken from the columns of the square. In group III the varieties in a block are specified by the numbers written first in the cells of the square. Thus, the varieties in the first block are those corresponding to number 1 wherever it occurs first in the cell; as examples, variety 1 is from row 1 column 1 of the completely orthogonalized square, variety 10 from row 2 column 5, variety 14 from row 3 column 4, etc.

"For group IV, the second numbers in the cells of the square are used to pick out the varieties. Thus, for the third block the number 3 is located in row 1 column 3 (variety 3), in row 2 column 1 (variety 6), etc.

"This set of six orthogonal groups constitutes a balanced incomplete block arrangement: in the 30 blocks of 5 plots, each of the 25 varieties occurs 6 times, once and once only with every other variety. The combination solution in the unreduced form would require a prohibitive number of blocks [1]."

The field arrangement for this type of symmetrical incomplete block design will be illustrated with the crested wheatgrass data. There are 25 varieties arranged in 6 replicates with 5 varieties in each block. The 5 varieties are randomized within each block.
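The construction quoted above can be checked mechanically. The following Python sketch is offered only as an illustration under stated assumptions: the square is typed in as reconstructed from the text (varieties numbered 1 to 25 row by row), and the helper names `build_groups` and `pair_counts` are hypothetical. It reads the six groups off the square and counts how often each pair of varieties shares a block.

```python
from itertools import combinations

# Completely orthogonalized 5 x 5 square: each cell holds four digits,
# and the varieties are numbered 1..25 row by row.
SQUARE = [
    ["1111", "2222", "3333", "4444", "5555"],
    ["2345", "3451", "4512", "5123", "1234"],
    ["3524", "4135", "5241", "1352", "2413"],
    ["4253", "5314", "1425", "2531", "3142"],
    ["5432", "1543", "2154", "3215", "4321"],
]


def build_groups(square):
    """Return the orthogonal groups: rows, columns, then one group per digit position."""
    k = len(square)

    def variety(r, c):
        return r * k + c + 1

    groups = [
        [[variety(r, c) for c in range(k)] for r in range(k)],  # Group I: rows
        [[variety(r, c) for r in range(k)] for c in range(k)],  # Group II: columns
    ]
    for pos in range(len(square[0][0])):  # Groups III..VI: digit positions
        blocks = [[] for _ in range(k)]
        for r in range(k):
            for c in range(k):
                blocks[int(square[r][c][pos]) - 1].append(variety(r, c))
        groups.append(blocks)
    return groups


def pair_counts(groups):
    """Count, for every pair of varieties, the number of blocks containing both."""
    counts = {}
    for blocks in groups:
        for block in blocks:
            for pair in combinations(sorted(block), 2):
                counts[pair] = counts.get(pair, 0) + 1
    return counts


if __name__ == "__main__":
    groups = build_groups(SQUARE)
    for number, blocks in enumerate(groups, start=1):
        print("Group", number, blocks)
    counts = pair_counts(groups)
    print(len(counts), set(counts.values()))
```

If the square is entered correctly, all 300 possible pairs of the 25 varieties should appear, each exactly once, which is the balance condition stated in the quotation.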
The block and replicate arrangement in the field may be as follows [2]:

      I      II     III     IV      V      VI
     5b      6b     13b    20b     24b    27b
     4b      7b     11b    18b     25b    29b
     1b     10b     12b    16b     21b    26b
     2b      8b     14b    19b     22b    28b
     3b      9b     15b    17b     23b    30b

III. Statistical Analysis of Incomplete Block Data

The symbols used in the discussion follow:

    m    = number of varieties (25)
    n'   = number of plots per block (5)
    n    = number of replicates of each variety (6)
    m'   = number of blocks (30)
    N    = mn = m'n' = total number of plots (150)
    λ    = n(n' - 1)/(m - 1) = number of times any 2 varieties occur together in a block (1)
    E    = (1 - 1/n')/(1 - 1/m) = efficiency factor of design (5/6)
    Sx   = sum of all N experimental values (217.79)
    S'x (subscript v) = sum of n experimental values for any one variety
    S'x (subscript b) = sum of n' experimental values for any one block
    s²   = error variance of a single experimental value

[1] The number of blocks in the unreduced solution is the number of combinations of m varieties taken n' at a time: 25!/(5! 20!) = 53,130.

[2] The blocks (5b, 4b, etc.) were arranged consecutively for the analysis of the data used in this problem, but they should be randomized (at least within replicates) in an actual field experiment. The Roman numerals refer to replicates.

(a) Computation of Block Totals

The yield data for the incomplete block experiment may be assembled as shown in Table 1 for the computation of the block totals. The numbers in parentheses refer to "varieties". The forage yields of crested wheatgrass are expressed as kilograms per plot.

Table 1. Plot Yields of the Symmetrical Incomplete Blocks Assembled for 25 Crested Wheatgrass "Varieties" in 6 Replicates.

    Replicate  Set or                     Plots in Block                            Block
               Block                                                                Totals
    I            1     (5)  1.25   (2)  1.52   (4)  1.30   (3)  1.83   (1)  1.64     7.54
                 2    (10)  1.38   (7)  1.48   (8)  1.41   (9)  1.52   (6)  1.46     7.25
                 3    (11)  1.42  (14)  1.40  (13)  1.35  (15)  1.32  (12)  1.19     6.68
                 4    (20)  1.20  (18)  1.32  (17)  1.59  (16)  1.21  (19)  1.22     6.54
                 5    (21)  1.48  (25)  1.14  (23)  1.16  (22)  1.54  (24)  1.67     6.99
    II           6     (1)  1.27  (16)  1.54  (21)  1.66   (6)  1.81  (11)  1.96     8.24
                 7    (12)  1.92  (17)  1.36   (2)  1.32  (22)  1.38   (7)  1.65     7.63
                 8    (23)  1.63   (3)  1.16  (13)  1.47   (8)  1.24  (18)  1.32     6.82
                 9    (19)  1.04   (9)  1.42  (14)  1.16  (24)  0.92   (4)  0.99     5.53
                10    (15)  1.34  (25)  1.38  (10)  1.02   (5)  1.22  (20)  1.72     6.68
    III         11    (18)  1.86   (1)  1.94  (14)  1.84  (22)  1.96  (10)  1.92     9.52
                12    (15)  1.64   (2)  1.64  (23)  1.66  (19)  1.78   (6)  1.83     8.55
                13    (20)  1.84   (7)  1.27  (11)  1.20   (3)  1.29  (24)  1.29     6.89
                14     (4)  1.33  (12)  1.54  (16)  1.16   (8)  1.48  (25)  1.54     7.05
                15    (13)  1.50   (9)  1.18  (21)  1.24   (5)  1.35  (17)  1.53     6.80
    IV          16     (1)  1.00  (20)  1.30  (23)  1.24  (12)  1.64   (9)  1.62     6.80
                17    (24)  1.62  (16)  1.48  (13)  1.66  (10)  1.62   (2)  1.83     8.21
                18     (3)  1.60   (6)  1.44  (17)  1.60  (14)  1.56  (25)  1.84     8.04
                19     (4)  1.30   (7)  1.31  (21)  1.45  (15)  1.51  (18)  1.72     7.29
                20    (22)  1.56   (5)  1.44   (8)  1.34  (11)  1.58  (19)  1.25     7.17
    V           21    (15)  1.48   (8)  1.72  (17)  1.68  (24)  1.94   (1)  1.84     8.66
                22     (9)  1.48   (2)  1.48  (11)  1.29  (25)  1.36  (18)  1.67     7.28
                23    (19)  1.48  (12)  1.40  (21)  1.26   (3)  1.44  (10)  1.86     7.44
                24    (22)  1.40  (20)  1.38   (4)  1.42  (13)  1.74   (6)  1.55     7.49
                25    (16)  1.50   (7)  1.35   (5)  1.64  (23)  1.68  (14)  1.32     7.49
    VI          26     (1)  1.22  (25)  1.22   (7)  1.70  (13)  1.56  (19)  1.60     7.30
                27     (2)  1.62   (8)  1.43  (20)  1.50  (21)  1.44  (14)  1.56     7.55
                28    (22)  1.32   (9)  0.93   (3)  1.46  (15)  1.36  (16)  1.41     6.48
                29    (10)  1.12  (17)  1.50  (11)  1.18   (4)  1.19  (23)  1.10     6.09
                30     (6)  1.08  (18)  1.00   (5)  1.08  (12)  1.32  (24)  1.31     5.79

    Grand Total                                                                    217.79

(b) Computation of Variety Means

In symmetrical incomplete block designs, a preliminary step is required to obtain the sum of squares for varieties.
Due to the fact that variety differences are partially confounded with block effects, it is necessary to compute each variety sum by a formula that involves both the yields of the plots planted to the variety and the yields of the blocks in which the variety occurs.

The first step is to accumulate the variety sums, which are recorded in Table 2. The yields for each variety are collected from Table 1. For example, the total yield of variety 1 is:

    S'x (variety 1) = 1.64 + 1.27 + 1.94 + 1.00 + 1.84 + 1.22 = 8.91

For each variety total there is also a sum of block totals (S'S'x), which is recorded in Table 2. Since variety 1 appears in blocks 1, 6, 11, 16, 21, and 26,

    S'S'x (variety 1) = 7.54 + 8.24 + 9.52 + 6.80 + 8.66 + 7.30 = 48.06

Table 2. Computation of Variety Means for the Crested Wheatgrass Experiment with 25 "Varieties" in 6 Replications.

                   Yields in Kg., by Replicate            Variety   Block totals   Q =           d =     Variety means
    Variety    I     II    III    IV     V     VI         totals    for variety    n'S'x-S'S'x   Q/25    Sx/N + d
                                                          S'x       S'S'x
       1     1.64  1.27  1.94  1.00  1.84  1.22            8.91       48.06          -3.51      -0.14      1.31
       2     1.52  1.32  1.64  1.83  1.48  1.62            9.41       46.76          +0.29      +0.01      1.46
       3     1.83  1.16  1.29  1.60  1.44  1.46            8.78       43.21          +0.69      +0.03      1.48
       4     1.30  0.99  1.33  1.30  1.42  1.19            7.53       40.99          -3.34      -0.13      1.32
       5     1.25  1.22  1.35  1.44  1.64  1.08            7.98       41.47          -1.57      -0.06      1.39
       6     1.46  1.81  1.83  1.44  1.55  1.08            9.17       45.36          +0.49      +0.02      1.47
       7     1.48  1.65  1.27  1.31  1.35  1.70            8.76       43.85          -0.05       0.00      1.45
       8     1.41  1.24  1.48  1.34  1.72  1.43            8.62       44.50          -1.40      -0.06      1.39
       9     1.52  1.42  1.18  1.62  1.48  0.93            8.15       40.14          +0.61      +0.02      1.47
      10     1.38  1.02  1.92  1.62  1.86  1.12            8.92       45.19          -0.59      -0.02      1.43
      11     1.42  1.96  1.20  1.58  1.29  1.18            8.63       42.35          +0.80      +0.03      1.48
      12     1.19  1.92  1.54  1.64  1.40  1.32            9.01       41.39          +3.66      +0.15      1.60
      13     1.35  1.47  1.50  1.66  1.74  1.56            9.28       43.30          +3.10      +0.12      1.57
      14     1.40  1.16  1.84  1.56  1.32  1.56            8.84       44.81          -0.61      -0.02      1.43
      15     1.32  1.34  1.64  1.51  1.48  1.36            8.65       44.34          -1.09      -0.04      1.41
      16     1.21  1.54  1.16  1.48  1.50  1.41            8.30       44.01          -2.51      -0.10      1.35
      17     1.59  1.36  1.53  1.60  1.68  1.50            9.26       43.76          +2.54      +0.10      1.55
      18     1.32  1.32  1.86  1.72  1.67  1.00            8.89       43.24          +1.21      +0.05      1.50
      19     1.22  1.04  1.78  1.25  1.48  1.60            8.37       42.53          -0.68      -0.03      1.42
      20     1.20  1.72  1.84  1.30  1.38  1.50            8.94       41.95          +2.75      +0.11      1.56
      21     1.48  1.66  1.24  1.45  1.26  1.44            8.53       44.31          -1.66      -0.07      1.38
      22     1.54  1.38  1.96  1.56  1.40  1.32            9.16       45.28          +0.52      +0.02      1.47
      23     1.16  1.63  1.66  1.24  1.68  1.10            8.47       42.74          -0.39      -0.02      1.43
      24     1.67  0.92  1.29  1.62  1.94  1.31            8.75       42.07          +1.68      +0.07      1.52
      25     1.14  1.38  1.54  1.84  1.36  1.22            8.48       43.34          -0.94      -0.04      1.41

    Totals                                               217.79     1088.95           0.00       0.00

Note: The sum of the S'x column (217.79) is equal to Sx, while the sum of the S'S'x column is equal to n'Sx. Therefore, the computations can be verified: (5)(217.79) = 1088.95.

For the computation of Q, the block sums are subtracted from 5 times the variety totals, i.e.,

    Q = n'S'x - S'S'x

For example, for variety 1,

    Q = 5(8.91) - 48.06 = -3.51

The Q value is then divided by the number of varieties in the test (25) to give the values for d in Table 2. Thus, d is the deviation of a variety mean from the mean yield of all the varieties in the experiment. The best estimate of the variety means is Sx/N + d. As an illustration, the mean of variety 1 is

    Sx/N + d = 217.79/150 + (-0.14) = 1.31

In the variety means, consideration has been given to the effect of partial confounding of variety differences with block effects. They are the best estimates of the yield performance.

(c) Derivation of Sums of Squares

The sums of squares may now be computed. The correction factor is the square of the total divided by the number of plots, viz.,

$$
\frac{(Sx)^2}{N} = \frac{(217.79)^2}{150} = \frac{47432.4841}{150} = 316.22
$$

The total sum of squares is obtained in the usual manner, i.e., by the addition of the squares of each individual plot yield with the correction factor subtracted:

$$
(1.25)^2 + (1.52)^2 + \cdots + (1.31)^2 - 316.22 = 8.23
$$

The sum of squares between means of blocks is obtained by the addition of the squares of the block totals, these being divided by the number of plots which make up each block total. The correction term is subtracted from this value.

$$
\frac{(7.54)^2 + (7.25)^2 + \cdots + (5.79)^2}{5} - 316.22 = 4.17
$$

The sum of squares between means of varieties is obtained from each Q value squared, added, and divided by N:

$$
\frac{(-3.51)^2 + (0.29)^2 + \cdots + (-0.94)^2}{150} = 0.56
$$

The analysis of variance is presented in Table 3. The F-value for varieties is the error mean square divided by the variety mean square, since the latter is the smaller.

Table 3. Analysis of Variance of Symmetrical Incomplete Block Design

    Source of Variation    D.F.    Sum of Squares    Mean Square    F-Value
    Blocks                  29         4.17            0.1438        3.94**
    Varieties               24         0.56            0.0233        1.57
    Error                   96         3.50            0.0365
    Total                  149         8.23

The standard error of the plot yields is

$$
s = \sqrt{0.0365} = 0.19 \text{ kilograms}
$$

The standard error of the difference between two of the corrected means will be

$$
\sqrt{\frac{2s^2}{n}\cdot\frac{n' + 1}{n'}} = \sqrt{\frac{(2)(0.0365)}{6}\cdot\frac{6}{5}} = 0.12
$$

IV. Efficiency Factor

The symmetrical incomplete block design is less efficient than the complete randomized block arrangement for equal numbers of replications when the soil is homogeneous. This is because there has been no reduction in error variance due to the reduction of block size. The efficiency of the incomplete block design as compared to randomized complete blocks is expressed by the fraction $(1 - 1/n')/(1 - 1/m)$ when the replication numbers in each arrangement are equal. In soils that are heterogeneous the reduction in block size usually more than compensates for the loss of information due to the arrangement. Goulden (1937) concluded that an increase of precision of 20 to 50 per cent was obtained over the complete randomized block arrangement.
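The three quantities used in this analysis (the Q adjustment of a variety total, the standard error of a difference, and the efficiency factor) are simple enough to restate as a short Python sketch. It is given only as an illustration: the function names are hypothetical, and the division of Q by the number of varieties follows the procedure described in the text for this design, in which any two varieties occur together in exactly one block.

```python
import math


def efficiency_factor(m, n_prime):
    """Efficiency relative to randomized complete blocks: (1 - 1/n') / (1 - 1/m)."""
    return (1 - 1 / n_prime) / (1 - 1 / m)


def se_difference(error_ms, n, n_prime):
    """Standard error of the difference between two corrected variety means."""
    return math.sqrt(2 * error_ms / n * (n_prime + 1) / n_prime)


def adjusted_mean(variety_total, block_sum, grand_total, m, n, n_prime):
    """Return (Q, corrected mean) for one variety.

    Q = n' * (variety total) - (sum of totals of the blocks containing the variety);
    d = Q / m is the deviation from the general mean, as described in the text.
    """
    q = n_prime * variety_total - block_sum
    d = q / m
    return q, grand_total / (m * n) + d


if __name__ == "__main__":
    # Values for variety 1 and for the analysis of variance in the text.
    q, mean = adjusted_mean(8.91, 48.06, 217.79, m=25, n=6, n_prime=5)
    print(round(q, 2), round(mean, 2))            # -3.51 1.31
    print(round(se_difference(0.0365, 6, 5), 2))  # 0.12
    print(round(efficiency_factor(25, 5), 4))     # 0.8333
```

An efficiency factor of 5/6 means that, on perfectly uniform soil, the 25-variety design in blocks of 5 would need about one-fifth more replication to match randomized complete blocks; the gain comes only where the smaller blocks reduce the error variance.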
In addition to the doubtful value of the symmetrical incomplete block design on very uniform soils, Weiss and Cox (1939) advise that the design not be employed to compare varieties which have an extremely large range in yields. However, poor varieties are usually eliminated in preliminary trials. The symmetrical incomplete block arrangement would provide a means to determine accurately relatively small differences between select varieties.

Goulden (1939) gives a list of the n' and n values for different numbers of varieties for which symmetrical incomplete blocks may be used:

    No. Varieties (m)                          13   16   21   25   31   49   57   64   73
    No. Plots in one block (n')                 4    4    5    5    6    7    8    8    9
    No. Replications for Each Variety (n)       4    5    5    6    6    8    8    9    9

References

1. Fisher, R. A. The Design of Experiments. Oliver and Boyd, 2nd Ed., pp. 100-171. 1937.
2. Fisher, R. A., and Yates, F. Statistical Tables for Biological, Medical, and Agricultural Research. Oliver and Boyd. 1938.
3. Goulden, C. H. Efficiency in Field Trials of Pseudo-Factorial and Incomplete Randomized Block Methods. Can. Jour. Res., C, 15:231-241. 1937.
4. Goulden, C. H. Modern Methods for Testing a Large Number of Varieties. Can. Dept. Agr. Tech. Bul. 9. 1937.
5. Goulden, C. H. Methods of Statistical Analysis. John Wiley, pp. 172-202. 1939.
6. Weiss, M. G., and Cox, G. M. Balanced Incomplete Block and Lattice Square Designs for Testing Yield Differences among Large Numbers of Soybean Varieties. Ia. Agr. Exp. Sta. Res. Bul. 257. 1939.
7. Yates, F. Complex Experiments. Suppl. Jour. Roy. Stat. Soc., 2:181-247. 1935.
8. Yates, F. Incomplete Randomized Blocks. Ann. Eugenics, 7:121-140. 1936.
9. Yates, F. A Further Note on the Arrangement of Variety Trials: Quasi-Latin Squares. Ann. Eugenics, 7:319-332. 1937.
10. Yates, F. The Design and Analysis of Factorial Experiments. Imp. Bur. Soil Sci. Tech. Comm. 35. 1937.

Questions for Discussion

1. Why is an ordinary randomized block design inaccurate for comparisons of a large number of varieties?
2. What principles are involved in incomplete block tests? What is a symmetrical (or balanced) incomplete block?
3. Explain how to write out the sets for a completely orthogonalized 5 by 5 square.
4. What variations in field lay-out are permissible with a symmetrical incomplete block test?
5. How does the computation for variety sums of squares differ from that for an ordinary randomized block?
6. What is the efficiency factor? Compare the efficiency of a symmetrical incomplete block test with that for a randomized block trial.
7. What are the limitations in the use of the symmetrical incomplete block design?
8. How would you arrange a variety test so as to be able to fit 47 varieties into a symmetrical incomplete block test?

Problems

1. It is desired to conduct a symmetrical incomplete block test for 16 varieties of wheat. The form to be used will be $m = n'^2$. A 4 by 4 orthogonalized square is given below. Write out the sets for the different blocks for each replicate.

    111    234    342    423
    222    143    431    314
    333    412    124    241
    444    321    213    132

2. Some uniformity trial data on wheat nursery plots were as follows in grams per 15 foot row (data from Dr. G. A. Wiebe); a question mark stands for a digit that is not legible in the copy:

    695  860  960  725  615  735  910  ?75  775  680
    645  745  815  700  605  630  ?10  730  635  535
    680  745  840  730  645  620  730  775  680  610
    620  745  660  565  520  560  675  690  635  525
    625  70?  725  6??  645  700  765  725  615  640
    685  78?  655  6?6  570  625  55?  590  590  605
    745  790  675  600  625  680  670  630  640  645
    655  730  615  650  640  625  700  675  720  695

Use the incomplete block sets written for problem 1 and apply the above yields to them. Calculate the data for a symmetrical incomplete block design.

CHAPTER XXI

MECHANICAL PROCEDURE IN FIELD EXPERIMENTATION

I. General Considerations

The experimental farm should be kept neat, clean, and in order at all times. Weeds should be hoed from plots and alleys and all trash destroyed. Alleys and roadways should be hoed or cultivated unless seeded to grass.
Straight plot rows add to the general attractiveness and in some cases to accuracy.

(a) Crop Rotation Scheme

Zavitz (1912) states that it is essential to have a rotation plan for the entire experimental farm in order to maintain soil fertility. In addition, accurate maps should be kept for the different fields so that a continuous record exists as to the crops grown on each field for all past years. A rotation scheme prevents mixtures in small grain nurseries as well as on other plots, since volunteer grain may contaminate seed plots where the same crop was on the land the previous year. On the Colorado Station farm it has been found advisable to fallow some of the fields to equalize the soil moisture differences due to irrigation and for weed control. However, many experiment stations prefer that a bulk crop always precede nursery plots. At the Nebraska station fallow has failed to equalize soil conditions.

(b) Preparation of Land for Experimental Crops

All plots for field trials should receive similar treatment except where the treatment itself is under study. Cultural operations should be at right angles to the direction of the plot rows so far as practicable. Thorne (1909) states that fertilizers should be applied by machinery rather than by hand methods because of the more uniform distribution. A two-way plow is useful in seedbed preparation as a means for the elimination of dead-furrows and back-furrows in the middle of the experimental area.

Seeding machinery used in experimental work must be accurate and, for that reason, should be calibrated wherever possible. Many machines are unfit for such work. A drill that fails to drop seed uniformly may cause a serious error in field plot yields. Moreover, it is very desirable to have a drill that can be cleaned out readily. Plot rows should be made straight because crooked rows cause irregularity in plot shape.
A -- Methods for Planting Experimental Crops

II. Seed Preparation

The best seed obtainable should be used in variety trials, i.e., seed that is pure as to variety, free from weed seeds and foreign material, high in germination, and uniform in size.

(a) Seed Source

Seed from entirely different sources may entirely upset the small differences commonly found in yield trials. All seed used in such trials should have been grown, harvested, and stored under uniform conditions for at least two years, according to Engledow and Yule (1926). This is usually impossible. Under those conditions, Parker (1931) advises "all that can be done is to see that the seed of the several varieties is approximately of equal germination and is equally sound and healthy in other ways." Adapted seed is highly desirable for self-fertilized crops and often even more so in cross-fertilized crops like corn.

(b) Other Considerations

Unless disease reaction is under study, seeds of cereals should be treated for control of fungus diseases such as smut. New Improved Ceresan, a dust treatment, may be used at the rate of one-half ounce per bushel for the covered smuts. All seed should be of the same age when possible. It should be weighed out for the particular test on the same scales, especially when planted by weight per unit area. The procedure on many stations is to measure out the seed for both nursery and field plots. For rod-row or nursery trials the seed is placed in coin envelopes and numbered to correspond to the plots. When a drill is used, a little more than enough seed is desirable because the drill itself measures the seed planted.

III. Rate of Seeding

Considerable error may be introduced in some crops through variation in rate of seeding.

(a) Small Grains

In small grains the investigator must either plant equal weights or equal numbers of seeds per unit area.
Up until 1910, the "centgener" method was extensively used in small-grain nurseries for the determination of yields. The kernels were space-planted 10 inches apart each way in blocks that contained 100 seeds. Aside from the theoretical objections in genetics, this was an absurd practice from the standpoint of field yields because the seeds were planted approximately 14 times as far apart as ordinarily occurs in a drill-planted field. In addition, a great amount of detailed hand labor was required. The method has been discarded in this country in favor of the rod-row.

(1) Rod-Row Trials: The general procedure in rod rows is to measure the seed per row. Kiesselbach (1928) summarizes the situation very well. Fortunately, he states, there may be considerable variation in the rate of seeding without material effect on the yield per acre. For instance, Turkey wheat planted at 3, 4, 5, 6, and 8 pecks per acre at Nebraska yielded 22.2, 24.6, 23.7, 24.4, and 24.5 bushels per acre, respectively, for 9 years. Seed of average size, or screened seed, should be used for machine planters. Measurement of the seed gives results more comparable with field conditions than where individual seeds are space-planted as in the centgener method. Seed for hand-planting should be weighed.

(2) The "Checkerboard" Trial: The English workers use the "checkerboard" to some extent in their variety trials. It is essentially a modified centgener plan in which the seeds are spaced 2 x 6 inches apart. They admit it differs from field conditions and, for this reason, use larger "observation" plots to supplement the checkerboard trials. The checkerboard is precise but requires too much time and labor where many varieties are under test.

(b) Other Crops

Corn is generally planted by farmers in rows 3.0 to 3.5 feet apart. The usual rate is three plants per hill for checked corn, or single plants 14 inches apart when drilled in the row.
Under dryland conditions, the corn is usually drilled with 20 to 30 inches between plants in the row. This is the practice in experimental work except that the seed is often planted at double the required rate, the plants being thinned later to the desired stand. Without this precaution, Kiesselbach (1928) points out, competition between adjacent rows that differ materially in stand may lead to faulty results. In sugar beets the seed is generally planted very thick. The plants are later thinned to the desired interval between plants, usually 12 inches. Sugar beets are ordinarily planted in rows 20 inches apart.

IV. Methods to Plant Field Plots

The ordinary grain drill is often used to plant field plots of small grain and forage crops.

(a) Calibration of Grain Drills

The necessity for drill calibration was shown by the work of Bonnett and Burkart (1923). The drill may be jacked up, the seed rate set as desired, and the wheels turned 30 revolutions at the rate they would turn over in the field. The amount of grain collected for each drill should be weighed. A mark should be made on the wheel to facilitate the count. It is only a matter of arithmetic to calculate the rate at which the seed will be planted.

(b) Use of the Drill

For small grain and forage crops the different replications of the same variety should be planted before the seed is changed. The plots may be staked out in advance to facilitate this procedure. The drill should be thoroughly cleaned out between varieties, possibly by the aid of an air bellows to dislodge seed in the corners of the drill box. Some drills are made over so that the seed box can be tipped forward on hinges to empty. In some experiments where two kinds of seed are planted in a plot, one crop may be drilled in one direction and the other at right angles to it, e.g., nurse crop studies in alfalfa.

V. Methods to Plant Small Grain Nursery Rows

Small grain nursery plots involve hand methods after the seedbed has been prepared.
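The arithmetic for the drill calibration in IV (a) above may be sketched as follows. This is a minimal illustration only: the wheel diameter and the drill width are assumed inputs that the text does not give, and the function name is hypothetical. The ground covered is taken as the wheel circumference times the number of revolutions, and the area as that distance times the width of the drill.

```python
import math

SQ_FT_PER_ACRE = 43_560


def seeding_rate_lb_per_acre(seed_lb, wheel_diameter_ft, drill_width_ft,
                             revolutions=30):
    """Seeding rate implied by a stationary calibration run.

    seed_lb            weight of grain collected from the whole drill
    wheel_diameter_ft  diameter of the drive wheel (assumed known)
    drill_width_ft     width of ground the drill covers (assumed known)
    revolutions        wheel turns counted, 30 as in the text
    """
    distance_ft = math.pi * wheel_diameter_ft * revolutions
    area_acres = distance_ft * drill_width_ft / SQ_FT_PER_ACRE
    return seed_lb / area_acres


if __name__ == "__main__":
    # Hypothetical drill: 4-foot wheel, 7 feet wide, 5.45 lb collected.
    rate = seeding_rate_lb_per_acre(5.45, 4.0, 7.0)
    print(f"{rate:.1f} lb per acre")
```

If the computed rate differs from the rate wanted, the drill setting is changed and the run repeated.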
Rod rows 12 inches apart are generally used. At some experiment stations 18-foot rows are planted, being trimmed down at harvest time to 16 feet for wheat, 20 feet for barley, and 15 feet for oats. This enables the investigator to convert the yields in grams per plot into bushels per acre by the use of a simple factor. The rod rows may be made by the use of a sled marker with the runners spaced at the proper intervals, the ideal type being horse-drawn. The rows are then opened with a wheel-hoe for hand planting. Another method is to use a sugar beet cultivator with bull tongues spaced at the proper intervals. This has proved to be very satisfactory at the Colorado station. A 12-inch furrow drill is used to mark out the rows on the Akron Station.

The seed, previously weighed out, is sometimes hand-planted (scattered) in the row. A Columbia or Planet Jr. planter is used in many cases to plant wheat. Modification of the Columbia drill for planting oats and barley has been suggested by Woodward and Tingey (1933) as well as by Jodon (1932). A rapid method for planting is by use of the spout drill. This is very satisfactory for genetic material where yield is not a factor. The grain is poured through the spout, all seed in the packet being planted in the row length. After a little experience the seed can be planted very uniformly. One man pushes the drill while another drops the seed. The spout drill may be used for space-planting after a little practice. One station that uses 3-row plots for nursery studies has a horse-drawn planter. A convenient method for space-planting small grains at definite intervals, for example two inches, is to take a 6-inch board and bore holes at the proper intervals. The seeds are dibbled in these holes.

VI. Methods to Plant Row Crops

Corn will be taken as an example of a row crop. Generally a horse-drawn or hand-drawn marker is used to mark the distances between rows.
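The "simple factor" mentioned under V above for converting grams per rod row into bushels per acre can be checked with a short sketch. The row lengths and the 12-inch spacing come from the text; the bushel weights of 60, 48, and 32 pounds for wheat, barley, and oats are the customary standards and are an assumption here, as is the function name.

```python
SQ_FT_PER_ACRE = 43_560
GRAMS_PER_LB = 453.592


def grams_to_bushels_factor(row_length_ft, lb_per_bushel, row_spacing_ft=1.0):
    """Multiplier that converts grams per row into bushels per acre."""
    rows_per_acre = SQ_FT_PER_ACRE / (row_length_ft * row_spacing_ft)
    grams_per_bushel = lb_per_bushel * GRAMS_PER_LB
    return rows_per_acre / grams_per_bushel


if __name__ == "__main__":
    # (harvested row length in feet, assumed pounds per bushel)
    crops = {"wheat": (16, 60), "barley": (20, 48), "oats": (15, 32)}
    for crop, (length_ft, bushel_lb) in crops.items():
        factor = grams_to_bushels_factor(length_ft, bushel_lb)
        print(f"{crop}: multiply grams per row by {factor:.4f}")
```

Under these assumptions the factor comes out very close to 0.1 for wheat and barley and 0.2 for oats, which suggests why these particular row lengths are chosen.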
When the corn is to be check-planted in hills the plots are cross-marked to give a set of squares, the intersections designating the hill locations. Hills are generally spaced 3.0 or 3.5 feet apart in all directions. Suitable alleys should be left between blocks to facilitate cultural and harvest operations. The stakes are distributed along one end of the plots. The seed sacks or envelopes, with the variety number on them, are distributed to correspond with the stakes. The numbers should be checked against the planting plan to avoid mistakes. The seed sacks may be re-distributed for each replicate. Corn is generally planted with a hand planter in yield trials. One of the most satisfactory planters is a made-over potato planter.* It is constructed with a long, full-length tin sleeve through which the proper number of kernels is dropped into the shoe. A nail sack is convenient for carrying the seed. For planting six kernels per hill, in order to thin to three plants later, it is convenient to plant three kernels each in two jabs about one inch apart. This facilitates thinning.

B -- Field Observations and Care

VII. Value of Field Observations

Intimate knowledge of experimental plots is extremely desirable. In fact, observations during the growing period of the crop may be as valuable as the yield data. Differences due to disease, irregular loss of plants, etc., may account for the variation in yield. Plot observations should be made at regular intervals. Notes should be entered in the field book at once while clear and vivid in the mind. Word descriptions should be clear and precise, being made as comparisons in terms of the check when possible. Sometimes sketches, diagrams, or photographs are a better method of expression than word descriptions.

VIII. Measurement of Plant Characters

Field counts or measurements on certain plant characters given in numbers or categories make excellent comparative field records.
Formal "score cards" are apt to make observations perfunctory. Hence, records should depend upon the particular crop and the needs that may arise. Some of the more important characteristics usually recorded are as follows: date emerged, stand, winter survival, date ripe, plant height, lodging, barren stalks (in corn), disease infection, etc. Some of these may be taken in quantitative measures while categories are required in other cases. When actual counts are out of the question, a numerical scale may be employed to convey the relative intensity of attack of a disease or insect pest. The numbers 1, 2, 3, and 4 may be used to represent, respectively, a slight, moderate, bad, or very severe attack of rust, mildew, etc. A scale of 1 to 4 is generally adequate for categorical data. Further sub-division merely leads to confusion. A very good rust scale is available in the agronomic field book used by the Division of Cereal Crops and Diseases, U.S.D.A.

Yates (1934) reports a bias between different observers when a large number of counts were made on wheat culms. The bias differed from observer to observer and from sample to sample. The same individual should make all counts, or at least all counts on a single replication, in order to avoid this form of systematic error.

IX. Stand Counts and Estimates

In certain crops stand counts are valuable, but this depends largely upon the experiment. In forage experiments the counts are often made by the use of square yard or meter quadrats. These may be permanent quadrats in perennial crop studies. In the case of winter or spring survival counts in winter wheat, the stand percentage is usually estimated except in special tests. One person should make all the estimates because of the large personal error invariably introduced when more than one person makes them. Estimation in categories such as good, fair, and poor stands may be satisfactory.
In plant-survival studies, as in winter wheat, a more precise method would be to space-plant the seed in rod rows at 2-inch intervals. However, spaced plants have been observed to winter-kill worse than seeded material. Such tests are valueless for yield.

*Note: The type used at the Nebraska, Minnesota, and Colorado stations is the "Acme Segment" potato planter manufactured by the Potato Implement Co., Traverse City, Mich. It can be slightly modified to make an excellent planter.

X. Date Headed

There is considerable variation among workers as to the date when a crop should be considered in head. In small grains, date in head is usually a more reliable index of earliness or lateness than date ripe. This is particularly true under dryland conditions where winds may prematurely dry up a variety rather than allow it to ripen normally. In wheat, oats, and barley, some investigators take notes on first heading, i.e., when 10 per cent of the heads are out of the boot. A plot is considered fully headed out by some workers when 75 per cent of the plants in the plot are in full head. Others use a standard as follows: (1) Oats, when the heads are half out of the boot; (2) Barley, when the beards are out of the boot; and (3) Wheat, when the heads show out of the boot. Date in silk or date in tassel are common notes in corn, date of silking being regarded as a more reliable index of relative maturity than date of tasseling. It is usual to determine the silking date and convert the data to the number of days from planting to one-half silking. The plots should be gone over at intervals of one or two days when date in head and similar notes are taken because some dates may have to be moved up and others back.

XI. Per cent Lodged

Data on the differential lodging of small grains are desirable as a measure of stiffness of straw. Sometimes after heavy rains or irrigations the soil may be loosened so that the entire plant falls over. This is not true lodging.
A plant has an inherently weak straw when it bends or breaks over. It is often difficult to arrive at inherent differences because of soil heterogeneity and its influences. A variety should be considered lodged when the straw leans at an angle of 45 degrees or more because, for practical purposes, grain lodged to such an extent is difficult to harvest. The per cent of grain so lodged is usually estimated regardless of the cause. However, straw weakness can be detected before the plants lean to an angle of 45 degrees. Some investigators make notations as to whether the straw is apparently weak, medium, or strong, and denote the condition categorically by W, M, or S. Under irrigated conditions, small grains may be irrigated heavily after heading to induce lodging.

In corn, the relative resistance to lodging is often reported as the percentage of plants erect at harvest. The percentages may be computed from counts of the numbers of plants erect. In the interest of uniformity a plant should be considered erect when it has not leaned more than 30 degrees from the vertical and its stalk is not broken below the ear. For those who wish to take more detailed records on lodged plants, it is suggested that such plants be separated into those lodged because of weak roots (leaning and down plants), and those lodged because of weak culms (plants broken below the ear).

XII. Plant Height

Two men are required to take plant height notes readily, one to make the measurements and the other to record the results. In the case of small grains such measurements are generally made just before harvest. Sometimes one measurement is taken per plot while, at other times, several plants are measured at random. One measurement per plot is enough when the heights are uniform. A convenient rule is a 1 x 1-inch stick marked at one-inch intervals to 60 inches.
Height notes in corn are often taken in the fall, but can be taken almost any time after the plants have tasselled out fully. The measurement can be made with an ordinary rule about 12 feet in length, 2.5 inches wide, and marked at 3-inch intervals.

XIII. Roguing Plots

Small grain plots should be thoroughly rogued for admixtures before harvest. The plots should be gone over several times, particularly when the plants begin to head or ripen. Rogues are most conspicuous at such times. It is difficult to rogue barley out of oats because the oat plants are generally taller than barley plants. Careful work is required to rogue off-varieties and off-types within a crop. These can be detected most readily by observed differences in culm height, date of heading, color of leaves, date ripe, and whether or not awns are present. It is a safe rule to pull all plants that fail to conform to the majority of the plants in a plot.

XIV. Date Ripe

The date on which a crop ripens is important, particularly in small grains where earliness is often a desirable feature. Some of the criteria used are given below.

(a) Wheat

The grain may be considered ripe when it is hard in the morning. The straw color is not always a reliable criterion of ripeness. Those who use straw color as a criterion generally consider the grain ripe when the first nodes below the heads on the main culms have turned brown.

(b) Other Crops

In oats the plot is usually considered ripe after practically all of the heads have turned yellow. The barley crop is generally considered ripe when all green has disappeared from the heads. It is difficult to estimate date ripe on small grain that is badly rusted or lodged as it tends to ripen unevenly and, in the case of rust, often prematurely. In corn, date in silk is usually regarded as a more reliable index of maturity than ripening data in the fall.

C -- Methods of Harvesting Experimental Crops

XV.
Difficulties in Harvesting

The time of harvesting crops often presents difficulties. Parker (1931) mentions that one might question the fairness when an early small grain variety is compared with a check variety that may ripen 10 days or more later. As a rule, plots are harvested as the varieties ripen, particularly where there are wide differences in time of ripening. In some parts of the country, the investigator may be able to wait until the latest varieties are ripe so that the entire field may be harvested at once. Except for extreme differences in time of ripening, it is usually possible to allow the early varieties to stand without particular damage to them. It may be desirable in some instances to carry out two separate trials, grouping the early varieties in one and the late ones in the other. In the case of root and tuber crops, all varieties may be left in the ground and harvested at the same time without serious consequences. The problem in corn is rather simple because all varieties are left in the field after becoming ripe so as to dry out. In forage experiments, inclement weather may interfere with the curing process and require that the hay be turned several times. As a result, it may dry out unevenly or the leaves may shatter. A possible error in weight might result.

XVI. Methods of Harvesting Field Plots

The use of farm machinery is often anticipated for large field plots.

(a) Small Grain Plots

Small grain field plots should be gone over carefully before harvest to be certain that there are no errors due to defective drilling, rodent injury, or other injury that might influence the yields. When small grains are badly lodged, it may be necessary to separate the varieties along the margins and push them over into their respective plots before harvest. Kiesselbach (1928) uses a binder equipped with an engine that operates the working parts. At the end of the plot, the horses are stopped but the engine continues to operate and clean out the binder.
In the absence of the engine it is necessary to crank the platform and elevator canvases by hand. The small grain shocks should be placed well within the plot to prevent chance mixtures with adjacent plots should wind scatter some of the bundles. At some stations the bundles are shocked on alternate ends of the plots. The shocks may be tied with binder twine to minimize the risk. When birds are numerous, shock covers should be provided. They can be made by sewing together ordinary burlap feed bags.

(b) Corn Yield Tests

The entire plot can be harvested without appreciable error when the plant stand is 90 per cent or better. Otherwise, it is advisable to reject at harvest all hills with less than the normal stand, and calculate yields on a perfect-stand basis. Usually the imperfect-stand hills are cut with a corn knife and removed from the plot. A record is then made of the number of perfect-stand hills that remain. Sometimes counts on barren stalks, 2-eared stalks, suckers, smutted plants, and lodged plants are made at this time. For small yield trials, actual harvesting can be done conveniently with an apple-picking bag. For large field plots, Kiesselbach (1928) uses a wagon with a flat rack with partitions built on it. A partition may be placed lengthwise through the center and each half divided, for instance, into three compartments where three center rows are harvested for yield (as in 5-row plots with border rows discarded). This allows a separate compartment for each row. Three men can husk, one man being on each yield row. The compartments on the other side can be used for the next plot on the return. At the end of the field, the corn from each plot is sacked and tagged. Field weights of ear corn are sometimes taken. The corn sacks may then be either piled in small piles in a shed until air dry, or they may be tied up on wires in a drying shed (Colorado method). The latter seems to allow the corn to dry out more evenly and more quickly.
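One way to read "yields on a perfect-stand basis" above is sketched below: the mean weight per perfect-stand hill is multiplied by the number of hills an acre would carry at full stand. This reading, the function name, and the example figures are assumptions for illustration; the 3.5-foot hill spacing is taken from VI.

```python
SQ_FT_PER_ACRE = 43_560


def yield_per_acre_perfect_stand(plot_weight_lb, perfect_hills,
                                 hill_spacing_ft=3.5):
    """Yield per acre from perfect-stand hills only.

    plot_weight_lb   weight harvested from the perfect-stand hills
    perfect_hills    number of perfect-stand hills left in the plot
    hill_spacing_ft  distance between hills in both directions
    """
    if perfect_hills <= 0:
        raise ValueError("at least one perfect-stand hill is required")
    hills_per_acre = SQ_FT_PER_ACRE / hill_spacing_ft ** 2
    weight_per_hill = plot_weight_lb / perfect_hills
    return weight_per_hill * hills_per_acre


if __name__ == "__main__":
    # Hypothetical plot: 50 lb of ear corn from 40 perfect-stand hills.
    print(f"{yield_per_acre_perfect_stand(50.0, 40):.0f} lb per acre")
```

The same weight unit (ear corn, shelled corn, or moisture-free corn) must be used for every plot in the test.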
Some stations now have elaborate drying equipment where the entire plot yield can be dried to a moisture-free basis in a relatively short time.

(c) Forage Experiments

Forage plots for hay are almost always cut with a mower when 1/40-acre in size or larger. The plots may be trimmed evenly on the ends before the regular cutting time. The material is then raked and removed. Borders between plots are generally disregarded for large field plots. It is an advantage to be able to start on one side of the field and mow through all the series, thus lessening the number of turns. A man should follow the mower with a fork to be sure that hay is not carried through the alley from one plot to the next. After the hay has been dried sufficiently, a side-delivery rake may be used to put it in windrows, after which it may be bunched by a dump-rake or by hand. A convenient method to handle the hay from each plot is to put it on a wagon or truck on which a sling has been placed. The load is then weighed, the net weight determined, and the hay unloaded from the truck by a cable stacker. A small composite sample may be taken to dry to an air-dry basis, or it may be ground for an immediate moisture determination. For plots 1/40-acre in size or smaller, hay may be weighed conveniently on portable platform scales on which a rack is set. For plots away from the central experiment station, a tripod and spring balance afford a good method to weigh forage plots. A large piece of canvas is equipped with snaps so that, when the hay is put on it, the sides can be gathered in and snapped to a ring. It is then readily hung to the scale.

(d) Sugar Beet Trials

In sugar beet yield trials, 4-row plots are generally used with the two center rows harvested for yield. Except in studies on stand, and certain other instances mentioned previously, the plots are commonly harvested on the basis of competitive beets, i.e., plants surrounded by plants on all sides.
The tops of the other beets (non-competitives) may be chopped off with a hoe before harvest. The roots are then pulled with a standard beet puller. It is common practice to pull one replication at a time. The roots without tops are usually weighed in order to have this component for total plot yield in case this seems to be needed later. The non-competitives are then discarded. Two 20-root samples may be taken from the non-competitive beets as a sugar sample. The competitive beets are then pulled, topped, and weighed for each plot. The tare is then subtracted from the field weight of the roots. When a washer is not available, the tare may be taken in the field. The sample for tare is first weighed, the roots cleaned with steel brushes, and re-weighed. The difference in weight is the tare. It is believed desirable to calculate the tare for each plot separately.

XVII. Harvesting Small Grain Nursery Plots

Competent and continual supervision is necessary in the small grain nursery at harvest time. Some investigators clean-cultivate the alleys between series. Under such conditions the rod rows are generally trimmed down to remove border effect. In wheat, for instance, the crop is planted in 18-foot rows, one foot being trimmed from each end of the plot. A string may be stretched across the series at both ends to designate the discard area to be cut, or a 16-foot bamboo pole may be used on each center row (in 3-row plots) so that the wheat may be cut on both ends of the pole. Other investigators plant the alleys to some readily distinguishable variety, thus eliminating the border effect on the ends of the rod rows. The alleys are then removed before harvest. Hand sickles are used to cut nursery plots. The smooth-edged sickle is most widely used, but a saw-toothed sickle is satisfactory when new. Where straw yields are taken, grass shears may be used to assure an even cut.
Kemp (1935) has constructed a rod-row harvester of the rotary shear type with which 2 men may cut 1500 rows per day. The harvested bundles are tied with binder twine, usually in one place. Strings should be tied with a simple, secure knot. The plot stake may be tied into the bundle, or a tag with the plot number on it may be attached to the string. The bundles may be tied on a table. Men who tie bundles should tape their fingers. Seed plots are often sacked with large paper sacks tied over the heads to prevent mixtures. By the aid of a large funnel, 25-pound manila bags are easily placed over the heads. Sacked bundles should be put under cover as soon as possible to protect them from rain. Small grain bundles may be either shocked in the field until they are ready to thresh, or hauled to a shed and hung up to dry. A drying shed may have wires about four feet apart stretched from one end to the other at sufficient height so that the bundles can be tied to the wire with heads down. The bundles should be hung fairly wide apart when they are harvested a little green. This is particularly true for oats.

XVIII. Harvest of Corn Breeding Material

Inbred and hybrid strains of corn, which are the result of hand pollination, are usually harvested after maturity. Individual ears may be collected in the bag over the ear shoot, and all sacks from the same row tied together with binder twine, the tying being done with an ordinary sack needle. These sacks are then hung to wires in the drying shed and allowed to remain there until air-dry. This method has proved very satisfactory at the Colorado station.

D -- Threshing and Storage

XIX. Methods of Threshing Field and Increase Plots

Small grain field and increase plots are commonly threshed with the standard grain separator. Kiesselbach (1928) has found it necessary to make minor modifications to adapt it for this purpose. He lists these changes as follows:
(1) removal of the grain elevator;
(2) elimination of the self-feeder;
(3) providing a hinged door at the foot of the tailings elevator for cleaning out between plots;
(4) replacing the grain auger with a shaker-trough device;
(5) removal of the grain-saving auger in the blower, where one exists;
(6) equipment with a high-pressure air pump and tank to supply air pressure through a hose to dislodge grains when the machine is cleaned out between plots;
(7) cutting several holes, with covers, into the sides of the separator at convenient places to observe the interior and to introduce air pressure to clean out the separator.

Such modifications make it easier to clean out the machine between plots, thus reducing the chances for mixtures. The chances for mixture may be reduced further by threshing all plots of the same variety in succession. Seed can be saved from the last plot of the variety to be threshed. It is important to operate the machine uniformly throughout each experiment. The grain per plot is often weighed on a platform scale at the separator.

XX. Threshing Nursery Plots

Small grains in yield trials are generally threshed with small nursery threshers, while genetic material is usually threshed by hand.

(a) Nursery Threshers

Several machines that can be cleaned readily have been devised to thresh small nursery plots. According to Hayes and Garber (1927) "the chief requisites of a machine to be used for experimental purposes are as follows: It should be easily cleanable and, in so far as possible, there should be no ledges or ridges upon which seeds may lodge. The alternate threshing of different nursery crops is a desirable procedure. Each of the plots of one strain of wheat may be threshed separately in rotation and then a strain of oats may be threshed in the same way. At the Minnesota Experiment Station winter wheat is threshed alternately with barley, and spring wheat with oats.
This plan helps materially to reduce the roguing of accidental mixtures from the plots." The Cornell machine designed by H. W. Teeter is very satisfactory for multiple-row plots, while the Kansas machine is widely used for rod rows. The Cornell machine has a shaker, screen, and fan. Its most serious drawback is the difficulty in cleaning it between varieties; however, it can be cleaned more readily than the Kansas machine. Recently, Vogel and Johnson (1934) have developed a new type of rod-row thresher which is a combination of an overshot cylinder and the modified screenless shaker and fan of an ordinary fanning mill. The grain is further cleaned by a separate re-cleaner. It has been found satisfactory for small grains, peas, flax, and some grasses. Grain weights are taken after threshing, usually in grams for rod-row plots.

(b) Hand Threshing

In genetic material where it is desired to thresh single plants, threshing is usually done by hand. A threshing board three feet square is useful for this purpose. The frame can be made of 1 x 2-inch material over which a canvas is stretched tightly. Two blocks, about 4 x 6 inches in size, are then made and covered on both sides with corrugated rubber. These work very well for threshing wheat and other naked grains. For barley, it has been found at the Colorado station that the heads thresh out better when rolled up in a small canvas cloth (about 9 inches square) and rubbed. A piece of tin bent to form a fan can be used to blow the chaff out of the grain by striking it on the canvas. Coffman (1935) was able to thresh 100 to V)0 single oat panicles per hour by the use of a light-weight, close-fitting leather glove on the right hand. The spikelets are stripped into a grain pan where the chaff is easily blown out.

XXI. Methods for Shelling Corn

After corn has reached an air-dry condition it is ready to shell for final determinations.
Genetic material is usually shelled by hand, although some workers use an enclosed single-ear sheller. An ordinary corn sheller is very satisfactory for yield trials. It should be enclosed so that the kernels are not scattered when the ears are shelled. The air-dry weight of ear corn should first be taken for the corn from each plot. A platform scale is often used for such weights. It should be balanced frequently to keep it in adjustment. The corn is then shelled, the cobs being looked over minutely to be sure that all kernels have been recovered. The shelled corn is then weighed and recorded. A 500-gram shrinkage sample is taken at the Nebraska station and oven-dried to a constant weight. The yield of moisture-free corn is calculated from the percentage of oven-dry corn in the shrinkage sample. At the Colorado station, bushel weight is taken with the standard bushel weight tester, since bushel weight has been found to be an index of maturity. Moisture determinations are made with the Tag-Heppenstall moisture meter, one sample per plot yield.

XXII. Storage of Seed of Experimental Crops

There are probably as many methods for seed storage of experimental crops as there are experiment stations. The first requisite is a place safe from mice and insects. Cabinets with metal drawers probably afford the best storage. It is usually necessary to fumigate once or twice per year where grain weevils and other insects are troublesome. For small seed lots a crystalline compound known as "Antimot" will effectively control insects. Small grain seed is usually kept in cloth bags, especially seed saved from rod-row tests. Genetic material is commonly stored in coin envelopes. Seed corn for variety or yield tests may be stored in large bins. Genetic and breeding material may be kept either in cloth bags or in envelopes. At the Nebraska station inbred and hybrid seed corn supplies are kept in large clip envelopes (6 x 9 inches in size).
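The moisture-free calculation described under XXI above, in which the plot yield is scaled by the oven-dry fraction of the shrinkage sample, may be sketched as follows. The function name and the example weights are hypothetical; only the 500-gram sample size comes from the text.

```python
def moisture_free_yield(shelled_weight, sample_fresh_g, sample_oven_dry_g):
    """Plot yield of shelled corn expressed on a moisture-free basis.

    shelled_weight     air-dry weight of shelled corn from the plot
    sample_fresh_g     weight of the shrinkage sample before drying
    sample_oven_dry_g  weight of the same sample at constant oven-dry weight
    """
    if not 0 < sample_oven_dry_g <= sample_fresh_g:
        raise ValueError("oven-dry weight must be positive and not exceed fresh weight")
    dry_fraction = sample_oven_dry_g / sample_fresh_g
    return shelled_weight * dry_fraction


if __name__ == "__main__":
    # Hypothetical plot: 20 lb shelled corn; 500 g sample dries to 440 g.
    print(f"{moisture_free_yield(20.0, 500.0, 440.0):.2f} lb moisture-free")
```

The result is in the same unit as the shelled weight supplied, so it may be carried directly into a yield-per-acre computation.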
These are filed in drawers in serial order. A similar plan is followed at Minnesota.

References

1. Bonnett, R. K., and Burkett, F. L. Rate of Seeding -- A Factor in Variety Tests. Jour. Am. Soc. Agron., 15:161-171. 1923.
2. Coffman, F. A. A Simple Method of Threshing Single Oat Panicles. Jour. Am. Soc. Agron., 27:498. 1935.
3. Engledow, F. L., and Yule, G. U. The Principles and Practices of Yield Trials. Empire Cotton Growing Corp. 1926.
4. Hayes, H. K., and Garber, R. J. Breeding Crop Plants. McGraw-Hill, pp. 132-100. 1927.
5. Jodon, N. E. Modifications in the Columbia Drill for Seeding Oats and Barley. Jour. Am. Soc. Agron., 24:328. 1932.
6. Kemp, H. J. Mechanical Aids to Crop Experiments. Sci. Agr., 15:488-506. 1935.
7. Kiesselbach, T. A. The Mechanical Procedure of Field Experimentation. Jour. Am. Soc. Agron., 20:433-442. 1928.
8. Love, H. H., and Craig, V. T. Methods Used and Results Obtained in Cereal Investigations at the Cornell Station. Jour. Am. Soc. Agron., 10:145-157. 1918.
9. Parker, W. H. The Methods Employed in Variety Tests by the National Institute of Agricultural Botany. Jour. Natl. Inst. Agr. Bot., Vol. 3, No. 1. 1931.
10. Standards for the Conduct and Interpretation of Field and Lysimeter Experiments. Jour. Am. Soc. Agron., 25:803-828. 1933.
11. Thorne, C. E. Essentials of Successful Field Experimentation. Ohio Agr. Exp. Sta. Cir. 96. 1909.
12. Vogel, O. A., and Johnson, A. J. A New Type of Nursery Thresher. Jour. Am. Soc. Agron., 26:629-630. 1934.
13. Wishart, J., and Sanders, H. G. Principles and Practice of Field Experimentation. Emp. Cotton Growing Corp. pp. 70-75 and 85-100. 1935.
14. Woodward, E. W., and Tingey, D. C. Improved Modification in the Columbia Drill. Jour. Am. Soc. Agron., 25:231. 1933.
15. Yates, F., and Watson, P. J. Observer's Bias in Sampling Observations on Wheat. Emp. Jour. Exp. Agr., 2:174-177. 1934.
16. Zavitz, C. A. Care and Management of Land Used for Experiments with Farm Crops. Proc. Am. Soc. Agron., 4:122-126. 1912.

Questions for Discussion

1. What precautions are necessary in a crop rotation scheme for experimental crops?
2. How should the seedbed be prepared for experimental crops?
3. Under what conditions should experimental seeds be treated for disease?
4. What is the centgener method? Checkerboard method? Rod-row method?
5. How is corn generally planted for experimental purposes? Sugar beets?
6. How would you calibrate a drill?
7. Explain how you would lay out, mark, and plant a wheat nursery. Give all dimensions and processes.
8. Why are field observations important? What plant measurements and notes are generally taken on small grains?
9. What different methods can be used for making stand counts?
10. At what time would you consider wheat, oats, and barley in head? Ripe?
11. How would you take lodging notes in small grains? Corn?
12. What precautions or advice should be given to your assistants when roguing plots?
13. How would you harvest small grains in a test where the varieties differed widely in date of ripening? Why?
14. How are large field plots of small grains generally harvested? Corn yield tests?
15. Give the detailed steps for harvesting sugar beet plots for yield.
16. Describe a method for harvesting forage yield tests.
17. Explain in detail how you would harvest small grain nursery plots.
18. What modifications on an ordinary grain separator are necessary to adapt it for threshing field plots to prevent mixtures?
19. What are the requisites for a small grain nursery thresher?
20. How would you hand-thresh barley heads? Wheat heads? Oat panicles?

Problems

1. It is desired to plant wheat in rod-row trials at the rate of 90 lbs. per acre, the rate used by farmers in the vicinity. The nursery rows are 18 feet long and 12 inches apart. Calculate the amount of seed to weigh out in grams for each row.
2.
Suppose the yield from a 16-foot rod row of wheat is 2op grams. Calculate the yield per acre.
3. The weight of shelled corn harvested is 23 lbs. on a plot 20 hills long. (a) When the hills are 36 x 36 inches, calculate the yield per acre for air-dry shelled corn. (b) Calculate the yield per acre on the basis of corn with 15.5 per cent moisture when the original shelled corn contained 13.2 per cent moisture.
4. Make up a table of factors for the conversion of pounds of shelled corn per plot to bushels of shelled corn per acre when 10 to 20 hills are harvested. Suppose the hills to be spaced 36 x 36 inches.

FIELD PLOT TECHNIQUE

APPENDIX

Table I. Area Under the Normal Curve¹

    t    A            t    A            t    A            t    A
  .00  .50000       .40  .65542       .80  .78815-     1.20  .88493
  .01  .50399       .41  .65910       .81  .79103      1.21  .88686
  .02  .50798       .42  .66276       .82  .79389      1.22  .88877
  .03  .51197       .43  .66640       .83  .79673      1.23  .89065+
  .04  .51595+      .44  .67003       .84  .79955-     1.24  .89251
  .05  .51994       .45  .67365-      .85  .80234      1.25  .89435-
  .06  .52392       .46  .67724       .86  .80511      1.26  .89617
  .07  .52790       .47  .68082       .87  .80785+     1.27  .89796
  .08  .53188       .48  .68439       .88  .81057      1.28  .89973
  .09  .53586       .49  .68793       .89  .81327      1.29  .90148
  .10  .53983       .50  .69146       .90  .81594      1.30  .90320
  .11  .54380       .51  .69497       .91  .81859      1.31  .90490
  .12  .54776       .52  .69847       .92  .82121      1.32  .90658
  .13  .55172       .53  .70194       .93  .82381      1.33  .90824
  .14  .55567       .54  .70540       .94  .82639      1.34  .90988
  .15  .55962       .55  .70884       .95  .82894      1.35  .91149
  .16  .56356       .56  .71226       .96  .83147      1.36  .91309
  .17  .56750       .57  .71566       .97  .83398      1.37  .91466
  .18  .57142       .58  .71904       .98  .83646      1.38  .91621
  .19  .57535-      .59  .72241       .99  .83891      1.39  .91774
  .20  .57926       .60  .72575      1.00  .84135-     1.40  .91924
  .21  .58317       .61  .72907      1.01  .84375+     1.41  .92073
  .22  .58706       .62  .73237      1.02  .84614      1.42  .92220
  .23  .59095+      .63  .73565+     1.03  .84850      1.43  .92364
  .24  .59484       .64  .73891      1.04  .85083      1.44  .92507
  .25  .59871       .65  .74215+     1.05  .85314      1.45  .92647
  .26  .60257       .66  .74537      1.06  .85543      1.46  .92786
  .27  .60642       .67  .74857      1.07  .85769      1.47  .92922
  .28  .61026       .68  .75175-     1.08  .85993      1.48  .93056
  .29  .61409       .69  .75490      1.09  .86214      1.49  .93189
  .30  .61791       .70  .75804      1.10  .86433      1.50  .93319
  .31  .62172       .71  .76115-     1.11  .86650      1.51  .93448
  .32  .62552       .72  .76424      1.12  .86864      1.52  .93575+
  .33  .62930       .73  .76731      1.13  .87076      1.53  .93699
  .34  .63307       .74  .77035+     1.14  .87286      1.54  .93822
  .35  .63683       .75  .77337      1.15  .87493      1.55  .93943
  .36  .64058       .76  .77637      1.16  .87698      1.56  .94062
  .37  .64431       .77  .77935+     1.17  .87900      1.57  .94179
  .38  .64803       .78  .78231      1.18  .88100      1.58  .94295-
  .39  .65173       .79  .78524      1.19  .88298      1.59  .94408

    t    A            t    A            t    A            t    A
 1.60  .94520      2.00  .97725      2.40  .99180      2.80  .99745-
 1.61  .94630      2.01  .97778      2.41  .99202      2.81  .99752
 1.62  .94738      2.02  .97831      2.42  .99224      2.82  .99760
 1.63  .94845-     2.03  .97882      2.43  .99245-     2.83  .99767
 1.64  .94950      2.04  .97933      2.44  .99266      2.84  .99774
 1.65  .95053      2.05  .97982      2.45  .99286      2.85  .99781
 1.66  .95154      2.06  .98030      2.46  .99305-     2.86  .99788
 1.67  .95254      2.07  .98077      2.47  .99324      2.87  .99795
 1.68  .95352      2.08  .98124      2.48  .99343      2.88  .99801
 1.69  .95449      2.09  .98169      2.49  .99361      2.89  .99807
 1.70  .95544      2.10  .98214      2.50  .99379      2.90  .99813
 1.71  .95637      2.11  .98257      2.51  .99396      2.91  .99819
 1.72  .95728      2.12  .98300      2.52  .99413      2.92  .99825+
 1.73  .95819      2.13  .98341      2.53  .99430      2.93  .99831
 1.74  .95907      2.14  .98382      2.54  .99446      2.94  .99836
 1.75  .95994      2.15  .98422      2.55  .99461      2.95  .99841
 1.76  .96080      2.16  .98461      2.56  .99477      2.96  .99846
 1.77  .96164      2.17  .98500      2.57  .99492      2.97  .99851
 1.78  .96246      2.18  .98537      2.58  .99506      2.98  .99856
 1.79  .96327      2.19  .98574      2.59  .99520      2.99  .99861
 1.80  .96407      2.20  .98610      2.60  .99534      3.00  .99865
 1.81  .96485+     2.21  .98645-     2.61  .99547      3.01  .99869
 1.82  .96562      2.22  .98679      2.62  .99560      3.02  .99874
 1.83  .96638      2.23  .98713      2.63  .99573      3.03  .99878
 1.84  .96712      2.24  .98746      2.64  .99586      3.04  .99882
 1.85  .96784      2.25  .98778      2.65  .99598      3.05  .99886
 1.86  .96856      2.26  .98809      2.66  .99609      3.06  .99889
 1.87  .96926      2.27  .98840      2.67  .99621      3.07  .99893
 1.88  .96995      2.28  .98870      2.68  .99632      3.08  .99897
 1.89  .97062      2.29  .98899      2.69  .99643      3.09  .99900
 1.90  .97128      2.30  .98928      2.70  .99653      3.10  .99903
 1.91  .97193      2.31  .98956      2.71  .99664      3.11  .99907
 1.92  .97257      2.32  .98983      2.72  .99674      3.12  .99910
 1.93  .97320      2.33  .99010      2.73  .99683      3.13  .99913
 1.94  .97381      2.34  .99036      2.74  .99693      3.14  .99916
 1.95  .97441      2.35  .99061      2.75  .99702      3.15  .99918
 1.96  .97500      2.36  .99086      2.76  .99711      3.16  .99921
 1.97  .97558      2.37  .99111      2.77  .99720      3.17  .99924
 1.98  .97615-     2.38  .99134      2.78  .99728      3.18  .99926
 1.99  .97670      2.39  .99158      2.79  .99737      3.19  .99929

    t    A            t    A            t    A            t    A
 3.20  .99931      3.40  .99966      3.60  .99984      3.80  .99993
 3.21  .99934      3.41  .99967      3.61  .99985-     3.81  .99993
 3.22  .99936      3.42  .99969      3.62  .99985+     3.82  .99993
 3.23  .99938      3.43  .99970      3.63  .99986      3.83  .99994
 3.24  .99940      3.44  .99971      3.64  .99986      3.84  .99994
 3.25  .99942      3.45  .99972      3.65  .99987      3.85  .99994
 3.26  .99944      3.46  .99973      3.66  .99987      3.86  .99994
 3.27  .99946      3.47  .99974      3.67  .99988      3.87  .99995-
 3.28  .99948      3.48  .99975-     3.68  .99988      3.88  .99995-
 3.29  .99950      3.49  .99976      3.69  .99989      3.89  .99995+
 3.30  .99952      3.50  .99977      3.70  .99989      3.90  .99995+
 3.31  .99953      3.51  .99978      3.71  .99990      3.91  .99995+
 3.32  .99955+     3.52  .99978      3.72  .99990      3.92  .99996
 3.33  .99957      3.53  .99979      3.73  .99990      3.93  .99996
 3.34  .99958      3.54  .99980      3.74  .99991      3.94  .99996
 3.35  .99960      3.55  .99981      3.75  .99991      3.95  .99996
 3.36  .99961      3.56  .99982      3.76  .99992      3.96  .99996
 3.37  .99962      3.57  .99982      3.77  .99992      3.97  .99996
 3.38  .99964      3.58  .99983      3.78  .99992      3.98  .99997
 3.39  .99965+     3.59  .99984      3.79  .99993      3.99  .99997
                                                       4.00  .99997
                                                       4.50  .99999

¹Table I was taken from "Tables" by L. R. Salvosa, published in "Annals of Mathematical Statistics", May 1930.
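The tabulated areas can be checked numerically. The following is a minimal sketch, assuming that A is the area under the standard normal curve from minus infinity up to t (which agrees with A = .50000 at t = 0). The function name `normal_area` is hypothetical and is not taken from the manual; the computation uses the error function in Python's standard library. Computed values may differ from the printed ones by a unit in the last decimal place.

```python
import math


def normal_area(t: float) -> float:
    """Area under the standard normal curve from -infinity to t.

    Assumption: this matches the quantity A tabulated in Table I.
    """
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def two_tailed_probability(t: float) -> float:
    """Probability of a deviation at least as large as |t| in either direction."""
    return 2.0 * (1.0 - normal_area(abs(t)))


if __name__ == "__main__":
    # Print a few entries for comparison with the table.
    for t in (0.0, 0.5, 1.0, 1.96, 2.58):
        print(f"t = {t:4.2f}   A = {normal_area(t):.5f}")
```

For example, `two_tailed_probability(1.96)` comes out very close to 0.05, the familiar 5 per cent level.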
Table II. Table of F and t.
(The body of this table is illegible in this copy. Columns are entered by the degrees of freedom for the greater mean square, rows by the degrees of freedom for the smaller mean square.)

Table III. Table of χ².
(The body of this table is illegible in this copy.)

Table IV.
Neparian or Hyperbolic ] jogarithms 1 1 2 3 4 5 6 .7 89 ■ 12 3 4 5 6789 1.0 0.0000 0100 0198 0296 0392 0488 0583 0677 0770 0862 10 19 29 33 48 57 67 76 86 1.1 1.2 1.3 O.G953 0.1823 0.2624 1044- II33 1222 1906 1989 2070 2700 2776 2852 1310 1398 1484 2151 2231 2311 2927 3001 3075 1570 1655 1740 2390 2469 2546 3148 3221 3293 9 17 26 35 44 8 16 24 32 4o 7 15 22 30 37 52 61 70 78 48 56 64 72 44 52 59 67 1.4 1.5 1.6 0.3365 0.4055 0.4700 3436 3507 3577 4121 4l87 4253 4762 4824 4886 3646 3716 3784 4318 4383 4447 4947 5008 5068 3853 3920 3983 4511 4574 4637 5128 5188 5247 7 14 21 28 35 6 13 19 26 32 6 12 18 24 30 4l 48 55 62 59 45 52 58 36 42 48 55 1.7 1.8 1.9 0.5306 0.5878 0.6419 5365 5423 5481 5933 5988 6043 6471 6523 6575 5539 5596 5653 6098 6152 6206 6627 6678 6729 5710 5766 5822 6259 6313 6366 6780 6831 6881 6 11 17 23 29 5 11 16 22 27 5 10 15 20 26 34 4o 46 51 32 38 43 49 31 36 4l 46 2.0 0.6931 6981 7031 7080 7129 7178 7227 7275 7324 7372 5 10 15 20 24 29 34 39 44 2.1 2.2 2-3 0.7419 0.7885 0.8329 7467 7514 7561 7930 7975 8020 8372 84l6 8459 7608 7655 7701 8065 8109 8154 8502 8544 3587 77^7 7793 7839 8198 8242 8286 8629 8671 3713 5 9 14 19 23 4 9 13 1.3 22 4 9 13 17 21 28 33 37 42 27 31 36 4o 26 30 34 38 2.4 2.5 2.6 0.8755 O.9163 0.9555 8796 8838 8879 9203 9243 9282 9594 9632 9670 8920 8961 9002 9322 9361 9400 9703 9746 9783 9042 9083 9123 9^39 9478 9517 9821 9858 9895 4 8 12 16 20 4 8 12 16 20 4 8 11 15 19 24 29 33 37 24 27 31 35 23 26 30 34 2.7 2.8 2.9 0.9933 1.0296 1.06^7 9969 0006 0043 0332 0367 0403 0682 0716 0750 0080 0116 0152 0438 0473 0503 0784 0818 0852 0188 0225 0260 0543 0578 0613 0886 0919 0953 4 7 11 15 13 4 7 11 14 18 3 7 10 14 17 22 25 29 33 21 25 28 32 20 24 27 31 3.0 1.0986 1019 1053 1036 1119 1151 1184 1217 1249 1282 3 7 10 13 16 20 23 •dS 30 3-1 3-2 3.3 1.1314 1.1632 1.1939 1346 1378 1410 1663 1694 1725 1969 2000 2030 1442 1474 1506 1756 1787 1817 2060 2090 2119 1537 1569 1600 1848 1878 1909 2149 2179 2208 3 6 10 13 16 3 6 9 12 15 3 6 9 12 15 19 22 
25 29 18 21 25 28 18 21 24 27 3A 3-5 3-6 1.2238 I.2528 1.2809 2267 2296 2326 2556 2585 2613 2837 2865 2892 2355 2384 2413 264l 2669 2698 2920 2947 2975 2442 2470 2499 2726 2654 2782 3002 3029 3056 3 6 9 12 15 3 6 8 11 14 3 5 8 11 14 17 20 23 26 17 20 22 25 16 19 22 25 3-7 3-3 3-9 I.3083 1.3350 I.3610 3110 3137 3164 3376 3403 3429 3635 3661 3686 3191 3218 3244 3455 3481 3507 3712 3737 3762 3271 3297 3324 3533 3553 3584 3788 3813 3833 3 5 8 11 13 3 5 8 10 13 35 3 10 13 16 19 21 24 16 18 21 23 15 18 20 23 4.o 1.3863 3888 3913 3938 3962 3987 4012 4036 4o6l 4o85 25 7 10 12 15 17 20 22 k.i k.2 4.3 1.4110 1.4351 1.4586 4134 4159 *H8^ 4375 4398 4422 4609 ^633 4656 4207 4231 4255 4446 4469 4493 4679 ^702 4725 4279 4303 4327 4516 454o 4563 4748 4770 4793 2 5 '7 10 12 2 5 7 9 12 2 5 7 9 12 14 17 19 22 14 16 19 '21 14 16 18 21 4.4 4.6 1.4816 1.5041 I.52&I 4839 ^861 4884 5063 5085 5107 5282 5304 5326 4907 4929 4951 5129 5151 5173 5347 5369 5390 i).'974 4996 5019 5195 5217 5239 5412 5433 5^54 2 57 9 11 2 4 7 9 11 2 4 6 9 11 14 16 18 20 13 15 18*20 13 15 17 19 260 4.8 5.0 5.1 5-2 5. 3 5.^ 5-5 5.6 5-7 5.8 5-9 6.0 6,1 6.2 6.5 6.14 6.5 6.6 6.7 6.8 6.9 o l . 5476 1.5686 l . 5892 . 609)+ 1.6292 1.6487 I.6677 1.6864 1.70^7 1.7228 1.7405 1.7579 1.7750 1.7918 I.8085 1.8245 1.8405 1 Table IV. Neparian or Hyp erb olic Logari thms! (Cont.) _ 3 5497 5518 5559 5707 5728 5748 5915 5935 5953 6ll4 6134 6154 6312 6332 6351 6506 6525 6544 6696 6715 6734 6882 6901 6919 7066 7084 7102 7246 7263 7281 7422 744o 7457 7596 7613 7630 7766 7785 7800 7934 7951 7967 8099 8262 8421 8116 8132 8278 8294 3437 8433 1.8565 8579 I.8718 I 87o3 1.8871 8886 S594 8610 8749 876^: 8901 8916 1.9021 1 9036 I.9169 9184 1.9315 '9356 7.0 7.1 7.2 7.3 7.4 7-5 7-6 7-7 7.8 7.9 3.0 8.1 8.2 8.3 1 n) 9459 1.9601 1.9741 1.9879 9051 9066 9199 9213 9^44 9339 9473 9488 950. 9615 9755 98Q2 9629 96!-: 3 9769 9782 9906 9920 2.0015 2.0149 2.0281 2.0412 2.0541 2.0669 2.0794 2.0919 2.1041 2. 
].l63 0028 0162 0295 0042 0055 OI76 OI89 0308 0321 0425 0554 0681 0438 0451 0567 0580 0694 0707 4 5560 5581 5602 5769 5790 5810 5974 5994 5oi4 6174 6194 6214 6371 6390 6409 6563 6582 6601 6752 6771 6790 6938 6956 6974 7120 7133 7299 7317 7156 733 1 )- 7475 7492 7509 7647 7664 7681 7817 7834 7831 7984 3001 8017 8148 8165 8310 8326 8469 8485 8131 8342 8500 8625 8641 8779 8795 8931 3946 8656 8310 8961 9081 9095 9228 9242 9373 9387 9110 9237 q4o2 9516 9530 9544 9657 9671 9796 9810 9933 99^7 9685 9824 9961 0069 0082 0202 0215 0334 0347 0096 0229 0807 0819 0832 0931 1054 1175 0943 0956 1066 1078 1187 1199 0464 0477 0592 0605 0719 0732 C-490 0618 0744 0844 0857 0869 0968 0980 1090 1102 1211 1223 0992 1114 1235 7 8 9 5623 5644 5665 5831 5851 5872 6034 6054 6074 6233 6253 6273 6429 6448 6467 6620 6639 6658 6808 6827 6845 6993 7011 7029 7174 7192 7210 7352 7370 7337 12 3 2 4 6 2 4 6 I 4 6 4 6 2 4 6 2 4 6 2 4 6 2 4 2 4 2 4 7527 7544 7561 I 2 3 7699 7716 7733 2 3 7867 7884 7901 2 3 3034 8050 8066 8197 3213 3358 8516 3374 8532 8229 ! 2 8390 J 2 8347 2 1 3__5_ 3 5 3 5 3 5 3 11 13 3 10 12 8 10 12 8 10 12 8 10 12 8 10 11 7 9 11 7 9 11 7 9 11 7 9 11 7 9 10 7 9 10 7 3 10 7 8 10 7 A 9 15 17 19 14 16 19 14 16 18 14 16 18 14 16 1.3 13 15 17 13 15 17 13 15 16 13 14 16 12 14 16 12 14 16 12 14 13 12 13 15 12 13 15 6 3 10 ill 13 15 6 8 10 |li 13 14 6 8 9 11 13 14 8672 3825 8976 9125 9272 9416 8687 8840 8991 9l4o 9286 9430 9539 9373 8703 I 2 8856 I 2 9006 J 2 i ™ 9135 I 1 9301 J 1 9449 j 1 9587 i 1 i b 6 ! 6 r 3 k 3 .4 9699 9838 9974 0109 0242 9713 985I 9988 9727 9865 0001 3 4 -a h o i 6 !6 8 3 3 11 12 14 11 12 14 11 12 14 9 10 12 13 9 10 12 13 9 10 12 13 9 1 10 11 13 7 8 7 8 7 8 10 11 13 10 11 10 11 12 12 012.2 0255 0136 ! 1 0268 ! 1 4 I.5 >373 0386 0399 \ l 0503 0631 0757 0516 0643 0769 0528 i 1 0656 I 1 0782 I 1 08S2 0894 0906 1 3 '7 8 7 8 l 5 6 8 5 6 3 568 5 o 8 1005 1017 1029 112b 1133 1150 . 1247 1258 1270 j 1 2 4 15 6 7 2 4 j 5 6 7 2 4 ! 
261

Table IV. Neperian or Hyperbolic Logarithms¹ (Cont.)

        0      1     2     3     4     5     6     7     8     9
8.4   2.1282  1294  1306  1318  1330  1342  1353  1365  1377  1389
8.5   2.1401  1412  1424  1436  1448  1459  1471  1483  1494  1506
8.6   2.1518  1529  1541  1552  1564  1576  1587  1599  1610  1622
8.7   2.1633  1645  1656  1668  1679  1691  1702  1713  1725  1736
8.8   2.1748  1759  1770  1782  1793  1804  1815  1827  1838  1849
8.9   2.1861  1872  1883  1894  1905  1917  1928  1939  1950  1961
9.0   2.1972  1983  1994  2006  2017  2028  2039  2050  2061  2072
9.1   2.2083  2094  2105  2116  2127  2138  2148  2159  2170  2181
9.2   2.2192  2203  2214  2225  2235  2246  2257  2268  2279  2289
9.3   2.2300  2311  2322  2332  2343  2354  2364  2375  2386  2396
9.4   2.2407  2418  2428  2439  2450  2460  2471  2481  2492  2502
9.5   2.2513  2523  2534  2544  2555  2565  2576  2586  2597  2607
9.6   2.2618  2628  2638  2649  2659  2670  2680  2690  2701  2711
9.7   2.2721  2732  2742  2752  2762  2773  2783  2793  2803  2814
9.8   2.2824  2834  2844  2854  2865  2875  2885  2895  2905  2915
9.9   2.2925  2935  2946  2956  2966  2976  2986  2996  3006  3016

Table of Neperian Logarithms of 10^(+n)

n             1       2       3       4       5        6        7        8        9
log_e 10^n    2.3026  4.6052  6.9078  9.2103  11.5129  13.8155  16.1181  18.4207  20.7233

Table of Neperian Logarithms of 10^(-n)

n             1       2       3       4        5        6        7        8        9
log_e 10^-n   3̄.6974  5̄.3948  7̄.0922  1̄0̄.7897  1̄2̄.4871  1̄4̄.1845  1̄7̄.8819  1̄9̄.5793  2̄1̄.2767

¹This table is reproduced from "Four Figure Mathematical Tables" by the late J. T. Bottomley, published by Macmillan and Co., Ltd. (London). The consent of the publishers and of the representatives of the author has been obtained.
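Entries of the kind tabulated in Table IV can be checked by direct computation. The following is a minimal sketch, not part of the original manual; the function names are illustrative assumptions. It rounds natural logarithms to four decimals, as a four-figure table does, and splits a negative logarithm into a negative whole part and a positive decimal part, which is the form the entries for 10^-n take (for n = 1, -3 + 0.6974).

```python
import math


def ln4(x: float) -> float:
    """Natural (Neperian) logarithm of x, rounded to four decimals."""
    return round(math.log(x), 4)


def ln_power_of_ten(n: int) -> float:
    """Natural logarithm of 10**n, rounded to four decimals."""
    return round(n * math.log(10), 4)


def split_negative_log(value: float) -> tuple[int, float]:
    """Split a logarithm into an integer part (floor) and a positive decimal part.

    For negative logarithms this gives the negative whole number and the
    positive four-decimal remainder that together equal the logarithm.
    """
    whole = math.floor(value)
    decimal_part = round(value - whole, 4)
    return whole, decimal_part


if __name__ == "__main__":
    print(ln4(4.0))                              # compare with the 4.0 row of Table IV
    print(ln_power_of_ten(1))                    # compare with log_e 10
    print(split_negative_log(-math.log(10)))     # compare with log_e 10^-1
```

A value such as log_e 40 can then be built the way the table intends, as log_e 4.0 plus log_e 10.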
(3520-39) 262 Table V. Values of percentages transformed into degrees of an angle. Angles of equal information are given in the "body of the table corresponding to observed per* contages along the left margin arid top. (hach angle ending in 5 is followed by a - or a - sign for guidance when the last decimal is dropped) . Table taken from article by Dr. Chester I. Bliss of the Institute for Plant Protec- tion; Moscow, Russia. Reproduced by permission of the author. / 0.00 . 01 0.02 . 03 0.Q4 0.03 0.06 0.0T .00 . 09 0.0 0.57 o.Si 0.99 1.15" 1.23 1.40 1 . 52 1.62 1 . 72 0.1 1.81 1.90 1.99 2.0T 2 . 14 2.22 2 .29 2 . 56 2.43 2.30 0.2 2.56 2.63 2.69 2.75- 2.8.1 2.87 2.02 2.98 3.03 3.09 0.3 3.1^ 3 . 19 3.24 3.29 3.2* 3.39 3.44 3.49 3.93 3.33 0.4 3.63 3.67 3.72 3.76 3.8c 5.83- 3.89 3-93 3.97 4.01 0.5 it-. 05+ k . 09 4,13 4 . 17 4.21 4.25+ 4 .20 4 . 35 4.3T 4.40 0.6 4,44 4.48 4 . 52 4 . 55+ 4.59 it. 62 4 . 66 4 . 69 ! +.73 4 . 76 o.T 4.80 4.83 4.87 4.90 4 . 95 4 . 9T s no 5.03 5.07 5 . 10 0.8 5.13 5.16 5.20 r~, ^i'/ 5 • 2 5.20 5.32 5.33+ 3. 53 5.41 0.9 5.44 5 AT 5 . 30 5.33 5 • 36 3.59 1- ,'Vo 5.634. -3 . 71 9-^-6 O.63 9.81 11 .0° 11.24 11.59 12.52 12.65 12.79 13.31 13-94 Ik, 06 15.00 15.12 15.23 lo.ll 1.6.22 le.32 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 5.74 6.02 6.29 6.^- 6.80 7.04 7.27 T.4S 7.71 2 8.15 3.55 8.55 6.72 3.91 9.10 9.28 3 9.98 10.14 10.51 10. 47 10.65 10.78 10.94 4 11,54 11.68 11.85 11.97 12.11 12.25- 12.59 5 12.92 13.05+ 13.16 13.51 15. H 1.5.56 13.69 6 14.16 lU.30 14.42 14.54 14.6?+ 14. TT 14.39 T 15.34 15.45+ 15.56 13.68 15. '79 15,89 16.00 8 16.45 16.54 16.64 16.74 16.65-- 16.95+ 17.05+ 17.16 17.26 17.36 9 17.46 17.56 17.66 17.76 17. 8 C ^ 17.93+ 16.05- 18.15- 13.24 18.54 10 13. 44 18.55 I8.63 18.72 18.81 16. 
91 19.00 19.09 I9.I9 1-9.28 11 19.37 19.46 19.55-1- 19.64 19.75 19.82 19.91 20.00 20.09 20.13 12 20.27 20.36 20..44 20.53 20,62 20/70 20.79 20.86 20.96 21.05- 15 21.13 21.22 21.30 21.39 21.47 21.56 21.64 21.72 21.31 21.89 14 21.97 22.06 22.11 22.22 22.30 22.33 22.46 22.35- 22„53 22.71 15 22 .79 22.67 22, 9e- 25,03 23.ll 23.19 23.26 23.36 23.42 23.50 lb 23.58 25.60 25.73 23.61 23.89 23.97 24.04 24.12 24.20 dh .27 1.7 24.35+ 24.43 24.50 24.58 24.65+ 24.73 24.80 24.33 24.05- 25.05 18 25.10 25.18 23.25+ 25.33 23.40 25.48 25.55- 25.62 25. TO 25.77 1.9 25.84 25.92 25.99 26.06 26.15 26.21 26.26 26.55- 26.42 26.49 20 26.3c 26.64 26.71 26.78 26.85+ 26.92 26.99 27.O0 21 27.28 27.30- 27.42 27,49 22 27,97 28. Oil- 28.11 26.18 23 23.66 26.73 28.79 28. 8e 24 29.35 29A0 29.47 29.55 27.56 27.65 2 r " 60 27. 76 28.25- 28.32 28.45 23.93 29.00 29.06 29.13 29.60 29.67 29. T5 29.80 27.15 27.20 27.65 28.32 27.00 P. en 29.20 29.87 29.27 29.95 263 Table V. Values of percentages transformed into degrees of an angle. Angles of equal information are given in the "body of the table corresponding to observed per- centages along the left margin and top. (Continued) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 25 30.00 30.07 30.13 30.20 30.26 30.33 30.4o 30.46 30.53 30.59 26 30.66 30.72 30.79 30.85+ 30.92 30.98 31.05- 31.11 31.18 31.24 27 31.31 31.37 31.44 31.50 31.56 31.63 31.69 31.76 51.82 31.88 28 31.95- 32.01 32.08 32.14 32.20 32.27 32.33 32.39 52.46 32.52 29 32.58 32.65- '32.71 32.77 32.83 32.90 32.96 .33.02 35.09 33.15- 30 33-21 33.27 33.3^ 33.40 33.46 33.52 33.58 33.65- 33.71 33.77 31 33.83 33.89 33.96 34.02 34.08 34.14 34.20 34.27 54.55 34.39 32 - 3kM- 34.51 3*. 57 34.63 34.70 34.76 34.82 34.88* 54.94 35.00 33 ■ 35.06 35.12 35.18 35.24 35.30 35.37 35.43 35.49 55.55- 35.61 34 35.67 35.73 35-79 35.85- 35.91 35.97. 
36.03 36.09- 36.15+ 36.21
35  36.27  36.33  36.39  36.45+ 36.51  36.57  36.63  36.69  36.75+ 36.81
36  36.87  36.93  36.99  37.05- 37.11  37.17  37.23  37.29  37.35- 37.41
37  37.47  37.52  37.58  37.64  37.70  37.76  37.82  37.88  37.94  38.00
38  38.06  38.12  38.17  38.23  38.29  38.35+ 38.41  38.47  38.53  38.59
39  38.65- 38.70  38.76  38.82  38.88  38.94  39.00  39.06  39.11  39.17
40  39.23  39.29  39.35- 39.41  39.47  39.52  39.58  39.64  39.70  39.76
41  39.82  39.87  39.93  39.99  40.05- 40.11  40.16  40.22  40.28  40.34
42  40.40  40.46  40.51  40.57  40.63  40.69  40.74  40.80  40.86  40.92
43  40.98  41.03  41.09  41.15- 41.21  41.27  41.32  41.38  41.44  41.50
44  41.55+ 41.61  41.67  41.73  41.78  41.84  41.90  41.96  42.02  42.07
45  42.13  42.19  42.25- 42.30  42.36  42.42  42.48  42.53  42.59  42.65-
46  42.71  42.76  42.82  42.88  42.94  42.99  43.05- 43.11  43.17  43.22
47  43.28  43.34  43.39  43.45+ 43.51  43.57  43.62  43.68  43.74  43.80
48  43.85+ 43.91  43.97  44.03  44.08  44.14  44.20  44.25+ 44.31  44.37
49  44.43  44.48  44.54  44.60  44.66  44.71  44.77  44.83  44.89  44.94
50  45.00  45.06  45.11  45.17  45.23  45.29  45.34  45.40  45.46  45.52
51  45.57  45.63  45.69  45.75- 45.80  45.86  45.92  45.97  46.03  46.09
52  46.15- 46.20  46.26  46.32  46.38  46.43  46.49  46.55- 46.61  46.66
53  46.72  46.78  46.83  46.89  46.95+ 47.01  47.06  47.12  47.18  47.24
54  47.29  47.35+ 47.41  47.47  47.52  47.58  47.64  47.70  47.75+ 47.81
55  47.87  47.93  47.98  48.04  48.10  48.16  48.22  48.27  48.33  48.39
56  48.45- 48.50  48.56  48.62  48.68  48.73  48.79  48.85+ 48.91  48.97
57  49.02  49.08  49.14  49.20  49.26  49.31  49.37  49.43  49.49  49.54
58  49.60  49.66  49.72  49.78  49.84  49.89  49.95+ 50.01  50.07  50.13
59  50.18  50.24  50.30  50.36  50.42  50.48  50.53  50.59  50.65+ 50.71
60  50.77  50.83  50.89  50.94  51.00  51.06  51.12  51.18  51.24  51.30
61  51.35+ 51.41  51.47  51.53  51.59  51.65- 51.71  51.77  51.83  51.88
62  51.94  52.00  52.06  52.12  52.18  52.24  52.30  52.36  52.42  52.48
63  52.53  52.59  52.65+ 52.71  52.77  52.83  52.89  52.95+ 53.01  53.07
64  53.13  53.19  53.25- 53.31  53.37  53.43  53.49  53.55- 53.61  53.67
65  53.73  53.79  53.85- 53.91  53.97  54.03  54.09  54.15+ 54.21  54.27
66  54.33  54.39  54.45+ 54.51  54.57  54.63  54.70  54.76  54.82  54.88
67  54.94  55.00  55.06  55.12  55.18  55.24  55.30  55.37  55.43  55.49
68  55.55+ 55.61  55.67  55.73  55.80  55.86  55.92  55.98  56.04  56.11
69  56.17  56.23  56.29  56.35+ 56.42  56.48  56.54  56.60  56.66  56.73
70  56.79  56.85+ 56.91  56.98  57.04  57.10  57.17  57.23  57.29  57.35+
71  57.42  57.48  57.54  57.61  57.67  57.73  57.80  57.86  57.92  57.99
72  58.05+ 58.12  58.18  58.24  58.31  58.37  58.44  58.50  58.56  58.63
73  58.69  58.76  58.82  58.89  58.95+ 59.02  59.08  59.15- 59.21  59.28
74  59.34  59.41  59.47  59.54  59.60  59.67  59.74  59.80  59.87  59.93
75  60.00  60.07  60.13  60.20  60.27  60.33  60.40  60.47  60.53  60.60
76  60.67  60.73  60.80  60.87  60.94  61.00  61.07  61.14  61.21  61.27
77  61.34  61.41  61.48  61.55- 61.62  61.68  61.75+ 61.82  61.89  61.96
78  62.03  62.10  62.17  62.24  62.31  62.37  62.44  62.51  62.58  62.65+
79  62.72  62.80  62.87  62.94  63.01  63.08  63.15- 63.22  63.29  63.36
80  63.44  63.51  63.58  63.65+ 63.72  63.79  63.87  63.94  64.01  64.08
81  64.16  64.23  64.30  64.38  64.45+ 64.52  64.60  64.67  64.75- 64.82
82  64.90  64.97  65.05- 65.12  65.20  65.27  65.35- 65.42  65.50  65.57
83  65.65- 65.73  65.80  65.88  65.96  66.03  66.11  66.19  66.27  66.34
84  66.42  66.50  66.58  66.66  66.74  66.81  66.89  66.97  67.05+ 67.13
85  67.21  67.29  67.37  67.45+ 67.54  67.62  67.70  67.78  67.86  67.94
86  68.03  68.11  68.19  68.28  68.36  68.44  68.53  68.61  68.70  68.78
87  68.87  68.95+ 69.04  69.12  69.21  69.30  69.38  69.47  69.56  69.64
88  69.73  69.82  69.91  70.00  70.09  70.18  70.27  70.36  70.45- 70.54
89  70.63  70.72  70.81  70.91  71.00  71.09  71.19  71.28  71.37  71.47
90  71.56  71.66  71.76  71.85+ 71.95+ 72.05- 72.15- 72.24  72.34  72.44
91  72.54  72.64  72.74  72.84  72.95- 73.05- 73.15+ 73.26  73.36  73.46
92  73.57  73.68  73.78  73.89  74.00  74.11  74.21  74.32  74.44  74.55-
93  74.66  74.77  74.88  75.00  75.11  75.23  75.35- 75.46  75.58  75.70
94  75.82  75.94  76.06  76.19  76.31  76.44  76.56  76.69  76.82  76.95-
95  77.08  77.21  77.34  77.48  77.61  77.75+ 77.89  78.03  78.17  78.32
96  78.46  78.61  78.76  78.91  79.06  79.22  79.37  79.53  79.69  79.86
97  80.02  80.19  80.37  80.54  80.72  80.90  81.09  81.28  81.47  81.67
98  81.87  82.08  82.29  82.51  82.73  82.96  83.20  83.45+ 83.71  83.98

Table V. (Continued)

        0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
99.0    84.26  84.29  84.32  84.35- 84.38  84.41  84.44  84.47  84.50  84.53
99.1    84.56  84.59  84.62  84.65- 84.68  84.71  84.74  84.77  84.80  84.84
99.2    84.87  84.90  84.93  84.97  85.00  85.03  85.07  85.10  85.13  85.17
99.3    85.20  85.24  85.27  85.31  85.34  85.38  85.41  85.45- 85.48  85.52
99.4    85.56  85.60  85.63  85.67  85.71  85.75- 85.79  85.83  85.87  85.91
99.5    85.95- 85.99  86.03  86.07  86.11  86.15+ 86.20  86.24  86.28  86.33
99.6    86.37  86.42  86.47  86.51  86.56  86.61  86.66  86.71  86.76  86.81
99.7    86.86  86.91  86.97  87.02  87.08  87.13  87.19  87.25+ 87.31  87.37
99.8    87.44  87.50  87.57  87.64  87.71  87.78  87.86  87.93  88.01  88.10
99.9    88.19  88.28  88.38  88.48  88.60  88.72  88.85+ 89.01  89.19  89.43
100.0   90.00
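The angles in Table V agree with the arcsine square-root transformation, angle = arcsin(sqrt(p/100)) expressed in degrees. The sketch below assumes that form, which is inferred from the tabulated values rather than stated in the manual, and the function name is illustrative.

```python
import math


def angular_transform(percent: float) -> float:
    """Transform a percentage (0 to 100) into degrees of an angle.

    Assumes the arcsine square-root form: arcsin(sqrt(percent / 100)),
    converted to degrees and rounded to two decimals as in Table V.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    proportion = percent / 100.0
    return round(math.degrees(math.asin(math.sqrt(proportion))), 2)


if __name__ == "__main__":
    # Compare each printed angle with the corresponding entry of Table V.
    for p in (1.0, 25.0, 50.0, 75.0, 99.0, 100.0):
        print(p, angular_transform(p))
```

Under this form the transformation is symmetric about 50 per cent: the angles for p and 100 - p add to 90 degrees, which gives a quick check on any pair of table entries.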
[Table of random numbers (page 266, per the Subject Index): not legible in this scan.]

267

SUBJECT INDEX
                                                              Page
Acclimatization .... 155
Acid phosphate .... 33
Agricultural research, support for .... 2
Agricultural science, development of .... 9
Agronomic experiments, criticisms of .... 167
  general types of .... 19
Agronomic research, results of .... 5
  rise of .... 1
  status of .... 1-8
  trends in .... 6
Analogy .... 24
Analysis of covariance, general .... 113
  computation of .... 114
  use of .... 119
Analysis of variance, general .... 103-112
  one criterion of classification .... 103
  two or more criteria of classification .... 108
  and agricultural experiments .... 110
  for computation of heterogeneity .... 134
Angular transformation, table of .... 262
Areas under normal curve, table of .... 251
Arithmetic average or mean .... 38
Bacteriology, development of .... 13
Basic plant sciences, history of .... 9-lS
Binomial distribution .... 70-74
Binomial distribution, applications of .... 71
  nature of .... 42
Border effect .... 162
Burnt limestone .... 33
Calcium-magnesium ratio .... 32
Calibration of grain drills .... 240
Cell in relation to inheritance .... 12
Chi square (χ²) .... 75-86
Chi square (χ²), applied to several genetic families .... 78
  correction for continuity .... 83
  distribution of .... 75
  null hypothesis and .... 83
  partition into components .... 80
  probability tables for .... 76
  table of .... 258
  use in homogeneity test .... 206
Coefficient of variability .... 45
Competition, and other plant errors .... 155-166
  concept .... 156
Complex experiments, application to variety trials .... 195-210
  versus simple experiments .... 168
Confounding, in factorial experiments .... 221-229
  in 2 by 2 by 2 experiment .... 223
  partial in 2 by 2 by 2 experiment .... 226
Constants, used to describe distributions .... 43
  of binomial distribution .... 70
Continuous selection .... 32
Corrections for uneven stands .... 159
Correlation, nature of .... 87
Correlation coefficient .... 89, 132
Correlation coefficient, calculation of .... 115
  for error of a difference .... 94
  significance of .... 95
Correlation surface .... 91
Covariance, analysis of .... 115
Crop rotation experiments .... 17°
Crucial tests .... 22
Cultural experiments .... 177
Date headed (small grains) .... 242
Date ripe (small grains) .... 243
Deep plowing .... 52
Degrees of freedom .... 62
Design, basic principles of .... X&J
  relation of type of experiment to .... 176
Differences in stand in plots .... 158
Discovery, methods of .... 24
Duration of tests .... 169
Dust mulch .... 51
Early agronomic experiments .... o
Efficiency factor .... 230
Empirical method .... 19
Error, control of .... 171
  reduction of with uniformity data .... 120
  sources of .... 28
  types of .... 2b"
Errors, in experimental work .... ^<o-.>
  related to plant .... 29
  related to soil .... 30
Evidence .... 22
Evolution and genetics .... 11
Experiments, crop rotation .... 17°
  cultural .... 177
  factorial .... 221
  fertilizer .... 177
  pasture .... 178
Experiment stations, establishment of ....
  funds for .... 5
Factorial experiments .... 221
Fallacies in agronomy .... 31
Fertilizer experiments .... 177
Field experiment, nature of .... ^0
  sources of variation in .... 2Q
Field observations .... 241
Field plots .... 142
Fit of observed data to normal curve .... 19
Frequency distribution .... 37
Genetics, modern developments in .... 13
Generalized standard error methods .... 103
Goodness of fit, chi square test for .... 77
Graphical representation .... 41
Grouping of data .... 40
Harvesting experimental plots .... 243
Homogeneity test .... 206
Hybridization of plants .... 11
Hypothesis, formulation of .... 21
  null .... 21
  qualities of .... 21
Incomplete experimental records .... 179
Incomplete block experiments .... 230
Independence, test for .... 81
Inductive method .... 17
Inter-plot competition .... 160
Intra-plot competition .... 157
Laboratory and greenhouse experiments .... 20
Land preparation .... 238
Large sample theory .... 55
Latin square .... 174
Laws of inheritance .... 13
Logic in experimentation .... 17-27
Mean of replicated variates .... 39
Means, of 2 independent samples .... 63
  of paired samples .... 65
Measurements .... 37, 241
Measures of central tendency .... 42
Mechanical procedure in field experimentation .... 238-248
Mendelian ratios, standard errors of .... 72
Methods, to plant field plots .... 239
  to plant nursery plots .... 240
  to plant row crops .... 240
Missing values .... 179
Moisture content of harvested crop .... 155
Neperian logarithms, table of .... 259
Nitrogen in plants .... 10
Normal distribution .... 42
Null hypothesis .... 21, 83
Nursery plots .... 142
Outline of experimental tests .... 167
Pasture experiments .... 178
Percent lodged .... 242
Percentage data, classification of .... 207
  transformation of .... 207
Personal equation .... ^-,29
Plant competition .... 155
Plant height .... 242
Plant individuality .... 154
Plant nutrition .... 10
Plant pathology .... 14
Plots, early use of .... 140
  kinds of .... 142
  size and shape .... 141
Plot arrangements .... 170
Plot efficiency, calculation of .... 147
Plot replication .... 149
Plot shape, practical considerations in .... 147
  relation to accuracy .... 146
Plot size, factors that influence .... 141
  for various crops .... 145
  relation to accuracy .... lk$
Poisson distribution .... 43, 73
Potometers .... 51
Preliminary tests .... 108
Principle of extremes .... 167
Probable errors of statistical constants .... 61
Probability, and normal curve .... 55
  determinations with small samples .... 62
  relation of binomial distribution to .... 70
  tables for .... -)b
  theory of .... "A
Quadrat and other sampling methods .... 186-194
Questionnaires and surveys .... 20
Random numbers .... 266
Randomized blocks .... 172
Rate and date tests .... lol.
Rate of seeding .... 239
Regression, linear .... Qo
  computation for grouped data .... 98
  significance of .... 99
Regression coefficients, calculation of .... 115
Regression equation, substitution in .... 118
Replication, history of .... 149
  in experimental work .... 149
  reduction of error by .... 150
  relation to soil heterogeneity .... 169
Replications, number of .... 151
Research .... 17
Roguing plots .... 242
Rules for computation .... 38
Sampling, economy in .... 188
  practices used in .... 190
Science, scope of .... 1-7
  among ancients .... 17
Seed preparation .... 238
Selection of seed corn .... 32
Sex in plants .... 11
Sheppard's correction .... 45
Short-cut methods for computation .... h-o
Significance, levels of .... 54
  of means .... 63
  of correlation coefficients .... 95
Significant differences .... 60
Simple vs. complex experiments .... 168
Small samples, special case of .... 62
  in biological research .... 62
Soil heterogeneity, amount of .... 135
  causes of .... 136
  corrections for .... 137
  relation to experimental field .... 137
  universality of .... 131
Split-plot experiments .... 211-220
Stand counts and estimates .... 241
Standard deviation .... 43
Standard errors, of statistical constants .... 57-60
  of Mendelian ratios .... 72
Statistical terms .... 37
Statistical methods, general applicability .... 47
Statistics, in experimental work .... 37
  as basis for generalization .... 54
Storage of seed .... 247
Student's pairing method .... 65
Symmetrical incomplete block experiments .... 230-237
Tables .... 251-266
Theory of evolution .... 11
Threshing, field plots and nursery plots .... 245
Types of frequency distributions .... 42
Uneven plant distribution .... 157
Uneven stands, corrections for .... 159
Uniformity trial data .... 131
Values of "F" and "t" .... 254
Variance, analysis of .... 103
Variations in seasons .... 30
Variety and similar tests .... 176